Kubernetes service not working as expected with kafka

Question

I'm trying to set up ZooKeeper and Kafka as separate Kubernetes deployments/pods in a shared namespace. I've bootstrapped a local K8s 1.8 cluster with Calico via kubeadm on my Ubuntu sandbox...

For ZooKeeper, I'm using the zookeeper:3.4 image from hub.docker.com, and I created a Kubernetes deployment and service where I expose ports 2181, 2888, and 3888. The service name is zookeeper, and I assume I should be able to reach it by this hostname from pods in the namespace.

For Kafka 1.0, I've created my own container image that I can control with environment variables... I'm setting zookeeper.connect to zookeeper:2181 and assume the Kubernetes DNS will resolve this and open a connection to the service.
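
A minimal sketch of the relevant part of my Kafka Deployment (the image name and the environment variable name are placeholders here, since my image is custom):

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: kafka
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: my-kafka:1.0               # placeholder for my custom image
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ZOOKEEPER_CONNECT   # mapped to zookeeper.connect inside the image
          value: "zookeeper:2181"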

Unfortunately I get:

[2018-01-03 15:48:26,292] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-01-03 15:48:32,293] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2018-01-03 15:48:46,286] INFO Opening socket connection to server zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,299] INFO Socket connection established to zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,319] INFO Session establishment complete on server zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181, sessionid = 0x10000603c560001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,331] INFO Session: 0x10000603c560001 closed (org.apache.zookeeper.ZooKeeper)
[2018-01-03 15:48:46,333] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server 'zookeeper:2181' with timeout of 6000 ms
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1233)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
    at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:115)
    at kafka.utils.ZkUtils$.withMetrics(ZkUtils.scala:92)
    at kafka.server.KafkaServer.initZk(KafkaServer.scala:346)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:194)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
    at kafka.Kafka$.main(Kafka.scala:92)
    at kafka.Kafka.main(Kafka.scala)

So I was assuming I have a generic networking issue in my cluster, but then I noticed something even more confusing... If I set zookeeper.connect to 10.107.41.148:2181 (the current address of the zookeeper service), the connection works (at least from Kafka to ZooKeeper):

[2018-01-03 15:51:31,092] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-01-03 15:51:31,094] INFO Opening socket connection to server 10.107.41.148/10.107.41.148:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:51:31,105] INFO Socket connection established to 10.107.41.148/10.107.41.148:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:51:31,134] INFO Session establishment complete on server 10.107.41.148/10.107.41.148:2181, sessionid = 0x10000603c560005, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
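
A quick way to cross-check the cluster DNS from another pod is, for example:

kubectl run -it --rm dnstest --image=busybox --restart=Never -- nslookup zookeeper.sandbox.svc.cluster.local

The log above already shows the name mapping to 10.107.41.148, so if this resolves too, plain DNS resolution can probably be ruled out.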

With this setup I'm able to use the zookeeper service from the Kubernetes cluster host, for example to run "bin/kafka-topics.sh --list --zookeeper 10.107.41.148:2181". Producing a message does not work, though... I assume that once the network is working properly, I will need to add the Kafka advertised address...

kafka-console-producer.sh --broker-list 10.100.117.196:9092 --topic test1
>test-msg1
>[2018-01-03 17:05:35,689] WARN [Producer clientId=console-producer] Connection to node 0 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
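
For example, overriding the advertised listener with an address that clients can actually reach (here the broker address from above; with a StatefulSet, the per-pod DNS name would be the more robust choice):

--override listeners=PLAINTEXT://0.0.0.0:9092
--override advertised.listeners=PLAINTEXT://10.100.117.196:9092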

Any hints on what is wrong with my Kubernetes network setup, or at least where to start troubleshooting?

Thank you and best regards, Pavel

Answer

If you use StatefulSets, you need to deploy the service first.

Here is my service:

apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  clusterIP: None
  ports:
  - port: 2181
    name: client
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  selector:
    app: zookeeper
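
Note the clusterIP: None: this makes it a headless service, so DNS returns the individual pod addresses instead of a single virtual IP, and each StatefulSet pod gets a stable name such as zookeeper-0.zookeeper. You can verify the service with:

kubectl get svc zookeeper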

Here is the ConfigMap (used later):

apiVersion: v1
kind: ConfigMap
metadata:
  name: zookeeper-cm
data:
  jvm.heap: "1G"
  tick: "2000"
  init: "10"
  sync: "5"
  client.cnxns: "60"
  snap.retain: "3"
  purge.interval: "0"

Here is my StatefulSet:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  serviceName: zookeeper
  replicas: 1
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zookeeper
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: zookeeper
        imagePullPolicy: IfNotPresent
        image: sorintdev/zookeeper:20171204m
        resources:
          requests:
            memory: 500Mi
            cpu: 200m
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        env:
        - name: ZK_REPLICAS
          value: "1"
        - name: ZK_HEAP_SIZE
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: jvm.heap
        - name: ZK_TICK_TIME
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: tick
        - name: ZK_INIT_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: init
        - name: ZK_SYNC_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: sync
        - name: ZK_MAX_CLIENT_CNXNS
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: client.cnxns
        - name: ZK_SNAP_RETAIN_COUNT
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: snap.retain
        - name: ZK_PURGE_INTERVAL
          valueFrom:
            configMapKeyRef:
              name: zookeeper-cm
              key: purge.interval
        - name: ZK_CLIENT_PORT
          value: "2181"
        - name: ZK_SERVER_PORT
          value: "2888"
        - name: ZK_ELECTION_PORT
          value: "3888"
        command:
        - bash
        - -c
        - zkGenConfig.sh && zkServer.sh start-foreground
        readinessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: data
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      resources:
        requests:
          storage: 1Gi
      accessModes:
      - ReadWriteOnce
      storageClassName: zookeeper-class
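
Apply the manifests service first (the file names are just examples), then watch the pods come up:

kubectl apply -f zookeeper-service.yaml
kubectl apply -f zookeeper-configmap.yaml
kubectl apply -f zookeeper-statefulset.yaml
kubectl get pods -l app=zookeeper -w

Note that the volumeClaimTemplates above expect a StorageClass named zookeeper-class to exist in your cluster.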

After you have deployed a working ZooKeeper configuration and the ensemble has elected a leader, you can proceed with the Kafka deployment.

Once ZooKeeper is deployed, your Kafka configuration must reference the ZooKeeper StatefulSet through the service. In Kafka you must define/override this property:

--override zookeeper.connect=zookeeper-0.zookeeper:2181
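
In the Kafka container spec this could look like the following (a sketch; the paths assume a standard Kafka distribution inside the image, and the image name is a placeholder):

containers:
- name: kafka
  image: my-kafka:1.0   # placeholder image
  command:
  - sh
  - -c
  - bin/kafka-server-start.sh config/server.properties --override zookeeper.connect=zookeeper-0.zookeeper:2181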

From inside a pod you should be able to ping zookeeper-0.zookeeper successfully.
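
For example (kafka-0 is just an example pod name, and ping must be available in the image):

kubectl exec -it kafka-0 -- ping -c 3 zookeeper-0.zookeeper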

Hope this helps.
