Kafka Connect cluster setup or launching Connect workers


Question

I am going through Kafka Connect, and I am trying to understand the concepts.

Let us say I have a Kafka cluster (nodes k1, k2 and k3) set up and running, and now I want to run Kafka Connect workers on different nodes, say c1 and c2, in distributed mode.

A few questions.

1) To launch Kafka Connect in distributed mode I need to use the command ../bin/connect-distributed.sh, which is available on the Kafka cluster nodes. So do I need to launch Kafka Connect from one of the Kafka cluster nodes? Or does any node from which I launch Kafka Connect need to have the Kafka binaries, so that I can use ../bin/connect-distributed.sh?

2) Do I need to copy my connector plugins to the Kafka cluster node (or to all cluster nodes?) from which I do step 1?

3) How does Kafka copy these connector plugins to the worker node before starting the JVM process there? The plugin is what contains my task code, so it needs to be on the worker for the process to start.

4) Do I need to install anything on the Connect cluster nodes c1 and c2, such as Java or anything related to Kafka Connect?

5) In some places it says to use Confluent Platform, but I would like to start with Apache Kafka Connect alone first.

Can someone please throw some light on this? Even a pointer to some resources would help.

Thanks.

Recommended answer

1) In order to have a highly available kafka-connect service, you need to run at least two instances of connect-distributed.sh on two distinct machines, configured with the same group.id. More details on configuring each worker are in the Kafka Connect documentation. For better performance, Connect should be run independently of the broker and ZooKeeper machines.
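As a rough sketch of what "the same group.id" looks like in practice, the Python snippet below renders a worker properties file that could be passed to connect-distributed.sh on both c1 and c2. The broker port 9092, the group name connect-cluster, the converter choice and the three storage topic names are assumptions for illustration, not values from the question; the property keys are the usual distributed-worker settings.

```python
def render_worker_properties(bootstrap_servers, group_id, plugin_path):
    """Return the text of a distributed-mode Connect worker config.

    Every worker that should join the same Connect cluster gets the
    same group.id and the same three storage topics.
    """
    settings = {
        "bootstrap.servers": ",".join(bootstrap_servers),
        "group.id": group_id,
        # Converter choice is an assumption; pick what your data needs.
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        # Topic names below are illustrative placeholders.
        "config.storage.topic": "connect-configs",
        "offset.storage.topic": "connect-offsets",
        "status.storage.topic": "connect-status",
        "plugin.path": plugin_path,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Hypothetical broker addresses for the k1, k2, k3 nodes in the question.
    print(
        render_worker_properties(
            ["k1:9092", "k2:9092", "k3:9092"], "connect-cluster", "/usr/share/java"
        ),
        end="",
    )
```

Saving the output as, say, connect-distributed.properties on c1 and c2 and starting each worker with that file would put both workers in one group, assuming both can reach the brokers.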

2) Yes, you need to place all your connectors under plugin.path (normally under /usr/share/java/) on every machine on which you plan to run kafka-connect.
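Since the plugins have to be present on every worker, a quick check on each machine can help before starting Connect. The sketch below lists what sits directly under each directory in a plugin.path value (a comma-separated list of directories). The helper name list_plugins is hypothetical, and the check only looks at the file system; it does not verify that the JARs contain valid connectors.

```python
import os


def list_plugins(plugin_path):
    """Return sorted names of plugin directories and JARs under plugin.path.

    plugin_path is the raw config value: one or more directories
    separated by commas. Missing directories are skipped silently.
    """
    found = []
    for directory in plugin_path.split(","):
        directory = directory.strip()
        if not directory or not os.path.isdir(directory):
            continue
        for entry in os.listdir(directory):
            full_path = os.path.join(directory, entry)
            # A plugin is either a directory of JARs or a single JAR file.
            if os.path.isdir(full_path) or entry.endswith(".jar"):
                found.append(entry)
    return sorted(found)


if __name__ == "__main__":
    # /usr/share/java is the location mentioned in the answer above.
    for name in list_plugins("/usr/share/java"):
        print(name)
```

Running this on c1 and c2 and comparing the output is one simple way to spot a worker that is missing a connector.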

3) kafka-connect loads the connectors on startup, so you don't need to handle this yourself. Note that if a kafka-connect instance is already running when a new connector is added, you need to restart the service.
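To confirm that a worker picked up a connector after a restart, one option is to ask its REST API which plugins it loaded. The sketch below calls GET /connector-plugins on a single worker and extracts the class names. The host c1 and port 8083 (the usual default REST port) are assumptions about your setup, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def plugin_classes(payload):
    """Extract sorted connector class names from a /connector-plugins response."""
    return sorted(item["class"] for item in json.loads(payload))


def fetch_plugin_classes(base_url="http://c1:8083"):
    """Query one worker's REST API; base_url is a placeholder for your worker."""
    with urlopen(f"{base_url}/connector-plugins", timeout=10) as response:
        return plugin_classes(response.read().decode("utf-8"))
```

Calling fetch_plugin_classes() against each worker in turn should return the same list if the plugins were installed consistently.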

4) You need to have Java installed on all your machines. For Confluent Platform in particular:


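"This version of Confluent Platform supports Java 1.7 and 1.8 (Java 1.9 is currently not supported). You should run with the Garbage-First (G1) garbage collector. For more information, see Supported Versions and Interoperability."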
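A small check like the following could be run on c1 and c2 to see which Java major version is installed before starting a worker. It is only a sketch: it assumes java is on the PATH and that java -version prints a line containing version "..." (which it normally writes to stderr). The function names are hypothetical.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the Java major version from `java -version` output, or None.

    Handles both the legacy scheme ("1.8.0_181" means 8) and the
    newer scheme ("11.0.2" means 11).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first = int(match.group(1))
    if first == 1 and match.group(2):
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` and parse it; None if java is not installed."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None
    return parse_java_major(result.stderr + result.stdout)
```

Which versions are acceptable depends on the Kafka or Confluent Platform release you run, so compare the result against that release's own compatibility notes.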

5) It depends. Confluent was founded by the original creators of Apache Kafka, and its platform is a more complete distribution that adds schema management, connectors and clients. It also comes with KSQL, which is quite useful if you need to act on certain events. Confluent simply adds on top of the Apache Kafka distribution; it is not a modified version.
