Distributed Official MongoDB Kafka Source Connector with Multiple Tasks Not Working

Problem Description

I am running Apache Kafka on my Windows machine with two Kafka Connect workers (ports 8083 and 8084) and one topic with three partitions (replication factor of one). My issue is that whenever I shut one worker down I can see the fail-over to the other Kafka Connect worker, but load balancing does not happen because the number of tasks is always ONE. I am using the official MongoDB Kafka Connector as a source (change stream) with tasks.max=6. I tried updating MongoDB with multiple threads so that more data would be pushed through Kafka Connect and perhaps cause it to create more tasks. Even under a higher volume of data, the task count remains one.
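For context, a source connector with tasks.max=6 in distributed mode would typically be registered with a request along the following lines (a minimal sketch; the connection URI, database, collection, and topic prefix are placeholders, not values from the original setup):

    # register the MongoDB source connector on the worker listening on 8083
    # (connection.uri, database, collection and topic.prefix are hypothetical)
    curl -X POST http://localhost:8083/connectors \
      -H "Content-Type: application/json" \
      -d '{
        "name": "mongodb-connector",
        "config": {
          "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
          "tasks.max": "6",
          "connection.uri": "mongodb://localhost:27017",
          "database": "mydb",
          "collection": "mycollection",
          "topic.prefix": "mongo"
        }
      }'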

How did I confirm that only one task is running? Through the API http://localhost:8083/connectors/mongodb-connector/status, which returns:

    {
      "name": "mongodb-connector",
      "connector": {
        "state": "RUNNING",
        "worker_id": "xx.xx.xx.xx:8083"
      },
      "tasks": [
        {
          "id": 0,
          "state": "RUNNING",
          "worker_id": "xx.xx.xx.xx:8083"
        }
      ],
      "type": "source"
    }

Am I missing something here? Why are more tasks not created?
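A quick way to count the running tasks (assuming curl and jq are available) is to query that same status endpoint and inspect the tasks array:

    # number of tasks the connector is actually running (expected 6, observed 1)
    curl -s http://localhost:8083/connectors/mongodb-connector/status | jq '.tasks | length'

    # list each task with the worker it is assigned to
    curl -s http://localhost:8083/connectors/mongodb-connector/status | jq '.tasks[] | {id, state, worker_id}'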

Answer

It seems this is the intended behavior of the official MongoDB Kafka source connector. This is the answer I got on another forum from Ross Lawley (MongoDB developer):

Prior to 1.2.0, only a single task was supported by the sink connector. The source connector still only supports a single task, because it uses a single Change Stream cursor. This is enough to watch and publish changes cluster-wide, database-wide, or down to a single collection.
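In other words, the scope of that single change stream is driven purely by the connector configuration. A minimal sketch of the three scopes (database and collection names are placeholders), shown as fragments of the connector's config object:

    # watch a single collection
    "database": "mydb", "collection": "mycollection"

    # watch every collection in one database (leave "collection" unset)
    "database": "mydb"

    # watch the whole cluster (leave both "database" and "collection" unset)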

I raised this ticket: https://jira.mongodb.org/browse/KAFKA-121 and got the following response:

The source connector will only ever produce a single task. This is by design, as the source connector is backed by a change stream. Change streams internally use the same data as the replication engine and as such should be able to scale as the database does. There are no plans to allow multiple cursors; however, should you feel that this is not meeting your requirements, you can configure multiple connectors, each with its own change stream cursor.
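Following that suggestion, the way to spread load is to register several connectors, each backed by its own change stream. A rough sketch (connector names, database, and collections are hypothetical) that splits two collections across two connectors:

    # connector watching the "orders" collection
    curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
      "name": "mongodb-connector-orders",
      "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://localhost:27017",
        "database": "mydb",
        "collection": "orders",
        "topic.prefix": "mongo"
      }
    }'

    # connector watching the "customers" collection
    curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
      "name": "mongodb-connector-customers",
      "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://localhost:27017",
        "database": "mydb",
        "collection": "customers",
        "topic.prefix": "mongo"
      }
    }'

Each connector still produces exactly one task, but the distributed workers on ports 8083 and 8084 will spread the connectors (and their single tasks) between them, so the load is no longer pinned to one worker.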
