Kubernetes service not working as expected with kafka
Problem description
I'm trying to set up ZooKeeper and Kafka as separate Kubernetes deployments/pods in a shared namespace. I've bootstrapped a local K8s 1.8 cluster with Calico via kubeadm on my Ubuntu sandbox...
For ZooKeeper, I'm using the zookeeper:3.4 image from hub.docker.com, and I created a Kubernetes deployment and service where I expose ports 2181, 2888, and 3888. The service name is zookeeper, and I assume I should be able to reach it by that hostname from the pods in the namespace.
For Kafka 1.0, I've created my own container image that I can control with environment variables... I'm setting zookeeper.connect to zookeeper:2181. I assume Kubernetes DNS will resolve this and open the connection to the service.
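For reference, the relevant part of my Kafka deployment looks roughly like the sketch below (the image tag and the env var name are placeholders, since the image is my own custom build; only the zookeeper:2181 value is the actual setting in question):

```yaml
# Hypothetical fragment of my Kafka deployment; image and env var name are
# specific to my custom image, not a standard Kafka container interface.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: kafka
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: my-kafka:1.0            # custom image mentioned above
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ZOOKEEPER_CONNECT  # mapped to zookeeper.connect inside the image
          value: "zookeeper:2181"
```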
Unfortunately I get:
[2018-01-03 15:48:26,292] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-01-03 15:48:32,293] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2018-01-03 15:48:46,286] INFO Opening socket connection to server zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,299] INFO Socket connection established to zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,319] INFO Session establishment complete on server zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181, sessionid = 0x10000603c560001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,331] INFO Session: 0x10000603c560001 closed (org.apache.zookeeper.ZooKeeper)
[2018-01-03 15:48:46,333] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server 'zookeeper:2181' with timeout of 6000 ms
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1233)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:115)
at kafka.utils.ZkUtils$.withMetrics(ZkUtils.scala:92)
at kafka.server.KafkaServer.initZk(KafkaServer.scala:346)
at kafka.server.KafkaServer.startup(KafkaServer.scala:194)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:92)
at kafka.Kafka.main(Kafka.scala)
So I was assuming I had a generic networking issue in my cluster, but then I noticed something even more confusing: if I set zookeeper.connect to 10.107.41.148:2181 (the current ClusterIP of the zookeeper service), the connection works (at least from Kafka to ZooKeeper).
[2018-01-03 15:51:31,092] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-01-03 15:51:31,094] INFO Opening socket connection to server 10.107.41.148/10.107.41.148:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:51:31,105] INFO Socket connection established to 10.107.41.148/10.107.41.148:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:51:31,134] INFO Session establishment complete on server 10.107.41.148/10.107.41.148:2181, sessionid = 0x10000603c560005, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
With this setup I'm able to use the ZooKeeper service from the host of the Kubernetes cluster, for example to run "bin/kafka-topics.sh --list --zookeeper 10.107.41.148:2181". Producing a message does not work though... I assume that once the network is working properly, I will need to add the Kafka advertised address...
kafka-console-producer.sh --broker-list 10.100.117.196:9092 --topic test1
>test-msg1
>[2018-01-03 17:05:35,689] WARN [Producer clientId=console-producer] Connection to node 0 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
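For reference, such an advertised-address override in Kafka's server.properties might look like the fragment below (the kafka service name and listener values here are assumptions about my setup, not a verified fix):

```properties
# Hypothetical server.properties fragment; the service DNS name is an assumption.
# listeners is what the broker binds to; advertised.listeners is what it tells
# clients to connect back to, which must be reachable from those clients.
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka.sandbox.svc.cluster.local:9092
```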
Any hints as to what is wrong with my Kubernetes network setup, or at least where to start troubleshooting?
Thank you and best regards, Pavel
Recommended answer
If you use StatefulSets, you need to deploy the headless service first.
Here is my service:
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  clusterIP: None
  ports:
  - port: 2181
    name: client
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  selector:
    app: zookeeper
Here is the ConfigMap (used later):
apiVersion: v1
kind: ConfigMap
metadata:
  name: zookeeper-cm
data:
  jvm.heap: "1G"
  tick: "2000"
  init: "10"
  sync: "5"
  client.cnxns: "60"
  snap.retain: "3"
  purge.interval: "0"
Here is my StatefulSet:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  serviceName: zookeeper
  replicas: 1
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - zookeeper
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: zookeeper
        imagePullPolicy: IfNotPresent
        image: sorintdev/zookeeper:20171204m
        resources:
          requests:
            memory: 500Mi
            cpu: 200m
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        env:
        - name: ZK_REPLICAS
          value: "1"
        - name: ZK_HEAP_SIZE
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: jvm.heap
        - name: ZK_TICK_TIME
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: tick
        - name: ZK_INIT_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: init
        - name: ZK_SYNC_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: sync
        - name: ZK_MAX_CLIENT_CNXNS
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: client.cnxns
        - name: ZK_SNAP_RETAIN_COUNT
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: snap.retain
        - name: ZK_PURGE_INTERVAL
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: purge.interval
        - name: ZK_CLIENT_PORT
          value: "2181"
        - name: ZK_SERVER_PORT
          value: "2888"
        - name: ZK_ELECTION_PORT
          value: "3888"
        command:
        - bash
        - -c
        - zkGenConfig.sh && zkServer.sh start-foreground
        readinessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: data
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      resources:
        requests:
          storage: 1Gi
      accessModes:
      - ReadWriteOnce
      storageClassName: zookeeper-class
After you have deployed a working ZooKeeper configuration and the ensemble has elected a leader, you can proceed with the Kafka deployment.
Once ZooKeeper is deployed, your Kafka configuration must reference the ZooKeeper StatefulSet through the service. In Kafka you must define/override this property:
--override zookeeper.connect=zookeeper-0.zookeeper:2181
From inside a pod you should be able to ping zookeeper-0.zookeeper successfully.
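A quick way to check that in-cluster DNS resolution is working might be the commands below (the throwaway pod name and busybox image are my own choices, not part of the original setup, and they assume a running cluster with kubectl configured):

```shell
# Resolve the pod's stable DNS name from a throwaway pod in the same namespace.
kubectl run -it --rm dns-test --image=busybox --restart=Never -- \
  nslookup zookeeper-0.zookeeper

# If that fails, check that the cluster DNS add-on is actually running,
# since per-pod name resolution problems often start there.
kubectl get pods -n kube-system -l k8s-app=kube-dns
```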
Hope this helps.