Distributed Official MongoDB Kafka Source Connector with Multiple Tasks Not Working

Problem description

I am running Apache Kafka on my Windows machine with two Kafka Connect workers (ports 8083 and 8084) and one topic with three partitions (replication factor of one). When I shut down one of the workers, I can see fail-over to the other worker, but load balancing is not happening because the number of tasks is always one. I am using the official MongoDB Kafka Connector as a source (change stream) with tasks.max=6. I tried updating MongoDB from multiple threads so that more data would be pushed into Kafka Connect, hoping that would make it create more tasks. Even under a higher volume of data, the task count remains one.

How did I confirm that only one task is running? Through the status API at http://localhost:8083/connectors/mongodb-connector/status, which returns:

    {
      "name": "mongodb-connector",
      "connector": {
        "state": "RUNNING",
        "worker_id": "xx.xx.xx.xx:8083"
      },
      "tasks": [
        {
          "id": 0,
          "state": "RUNNING",
          "worker_id": "xx.xx.xx.xx:8083"
        }
      ],
      "type": "source"
    }

Am I missing something here? Why are more tasks not created?
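The status check described in the question can also be scripted. The following is a minimal sketch, assuming the response has the shape quoted in the question; the helper name summarize_tasks is hypothetical and not part of Kafka Connect.

```python
import json
from collections import Counter


def summarize_tasks(status: dict) -> dict:
    """Summarize the 'tasks' array of a Kafka Connect status response.

    Hypothetical helper: it only inspects the JSON document and does not
    contact a Connect worker.
    """
    tasks = status.get("tasks", [])
    return {
        # Number of task entries reported for the connector.
        "task_count": len(tasks),
        # How many tasks are in each state, e.g. {"RUNNING": 1}.
        "states": dict(Counter(t.get("state") for t in tasks)),
        # Distinct workers that currently host at least one task.
        "workers": sorted({t["worker_id"] for t in tasks if t.get("worker_id")}),
    }


# Sample document mirroring the response quoted in the question.
SAMPLE = json.loads("""
{
  "name": "mongodb-connector",
  "connector": {"state": "RUNNING", "worker_id": "xx.xx.xx.xx:8083"},
  "tasks": [
    {"id": 0, "state": "RUNNING", "worker_id": "xx.xx.xx.xx:8083"}
  ],
  "type": "source"
}
""")

if __name__ == "__main__":
    print(summarize_tasks(SAMPLE))
```

With the response from the question, the summary reports a single task on a single worker, which matches what was observed by hand.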

Recommended answer

This appears to be the behavior of the official MongoDB Kafka Source Connector. This is the answer I got on another forum from Ross Lawley (MongoDB developer):

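Prior to 1.2.0, the sink connector only supported a single task. The source connector still only supports a single task, because it uses a single change stream cursor. This is enough to watch and publish changes cluster-wide, database-wide, or down to a single collection.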

I also raised this ticket: https://jira.mongodb.org/browse/KAFKA-121 and got the following response:

The source connector will only ever produce a single task. This is by design, as the source connector is backed by a change stream. Change streams internally use the same data as the replication engine and as such should be able to scale as the database does. There are no plans to allow multiple cursors. However, should you feel that this does not meet your requirements, you can configure multiple connectors, and each would have its own change stream cursor.
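The workaround mentioned in the ticket, one connector per change stream, might look roughly like the sketch below. It builds one registration payload per collection for the Kafka Connect REST API (POST /connectors). The connection URI, database name, collection names, connector naming scheme, and topic prefix are all assumptions made for illustration, and the helpers build_source_connectors and register are hypothetical.

```python
import json
import urllib.request


def build_source_connectors(connection_uri, database, collections, topic_prefix="mongo"):
    """Build one connector registration payload per collection.

    Each payload configures its own MongoDB source connector, so each
    connector opens its own change stream cursor.
    """
    payloads = []
    for coll in collections:
        payloads.append({
            # Connector names must be unique within the Connect cluster.
            "name": f"mongodb-source-{database}-{coll}",
            "config": {
                "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
                "connection.uri": connection_uri,
                "database": database,
                "collection": coll,
                "topic.prefix": topic_prefix,
                # The source connector only ever produces a single task.
                "tasks.max": "1",
            },
        })
    return payloads


def register(connect_url, payload):
    """POST one payload to a Connect worker; returns the HTTP status code."""
    req = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical values; replace with your own deployment details.
    for p in build_source_connectors(
        "mongodb://localhost:27017/?replicaSet=rs0", "shop", ["orders", "customers"]
    ):
        print(json.dumps(p, indent=2))
        # register("http://localhost:8083", p)  # uncomment against a live worker
```

Because each connector is a separate unit of work, Kafka Connect in distributed mode can place them on different workers, which is how this approach may restore some load balancing even though every connector still runs a single task.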
