Storm latency caused by ack

Problem description

I am using kafka-storm to connect Kafka and Storm. I have 3 servers running ZooKeeper, Kafka, and Storm. There is a Kafka topic, 'test', with 9 partitions.

In the Storm topology, the KafkaSpout has 9 executors, so by default it should have 9 tasks as well. The 'extract' bolt is the only bolt connected to the KafkaSpout, which is the 'log' spout.

The UI shows a very high failure rate in the spout. However, the number of messages executed in the bolt = the number of messages emitted - the number of messages failed in the bolt. This equation holds almost exactly at the beginning, while the failed count is still empty.

As I understand it, this means the bolt did receive the messages from the spout, but the ack signals are held up in flight. That is why the number of acks in the spout is so small.

This problem might be solved by increasing the timeout seconds and the number of pending spout messages. But that uses more memory, and I cannot increase them indefinitely.
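
To make the memory trade-off concrete, here is a rough sketch of the arithmetic. It relies on topology.max.spout.pending being a per-spout-task limit, so the topology-wide bound on un-acked tuples is the task count times that limit. The class name `PendingBound`, the pending value of 1000, and the 1 KB average tuple size are illustrative assumptions, not values from the question.

```java
final class PendingBound {
    /** Upper bound on un-acked tuples across all spout tasks. */
    static long maxInFlightTuples(int spoutTasks, int maxSpoutPending) {
        return (long) spoutTasks * maxSpoutPending;
    }

    /** Rough payload bytes in flight, ignoring Storm's own per-tuple overhead. */
    static long approxInFlightBytes(int spoutTasks, int maxSpoutPending, int avgTupleBytes) {
        return maxInFlightTuples(spoutTasks, maxSpoutPending) * avgTupleBytes;
    }

    public static void main(String[] args) {
        int spoutTasks = 9;         // from the question: 9 KafkaSpout tasks
        int maxSpoutPending = 1000; // assumed value, for illustration only
        int avgTupleBytes = 1024;   // assumed average payload size

        System.out.println(maxInFlightTuples(spoutTasks, maxSpoutPending));                  // 9000
        System.out.println(approxInFlightBytes(spoutTasks, maxSpoutPending, avgTupleBytes)); // 9216000
    }
}
```

A longer timeout does not raise this bound by itself, but it lets tuples stay pending longer before they are failed and replayed.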

I was wondering whether there is a way to force Storm to ignore acks in some spouts/bolts, so that it does not wait for that signal until the timeout. This should increase throughput significantly, at the cost of no longer guaranteeing message processing.

Recommended answer

If you set the number of ackers to 0, Storm will automatically ack every tuple.

config.setNumAckers(0);

Please note that the UI only measures and shows 5% of the data flow, unless you set

config.setStatsSampleRate(1.0d);

Try increasing the bolt's timeout and reducing the value of topology.max.spout.pending.
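
As a hedged sketch of what that tuning might look like: Storm's `Config` is a map of string keys to values, so the example below uses a plain `java.util.Map` to stay runnable without Storm on the classpath. It assumes the timeout in question is the topology-wide `topology.message.timeout.secs` setting. The class name `TuningSketch` and the values 60 and 500 are illustrative assumptions. In a real topology the same settings would normally go through `config.setMessageTimeoutSecs(...)` and `config.setMaxSpoutPending(...)`.

```java
import java.util.HashMap;
import java.util.Map;

final class TuningSketch {
    /** Builds the two settings the answer suggests adjusting. */
    static Map<String, Object> tunedConfig(int timeoutSecs, int maxSpoutPending) {
        Map<String, Object> conf = new HashMap<>();
        // How long a tuple tree may stay un-acked before the spout fails it.
        conf.put("topology.message.timeout.secs", timeoutSecs);
        // Maximum number of un-acked tuples allowed per spout task.
        conf.put("topology.max.spout.pending", maxSpoutPending);
        return conf;
    }

    public static void main(String[] args) {
        // 60 and 500 are assumed starting points, not recommendations.
        System.out.println(tunedConfig(60, 500));
    }
}
```

Lowering the pending limit keeps fewer tuples queued ahead of a slow bolt, so each tuple is more likely to be acked before the timeout expires.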

Also, make sure the spout's nextTuple() method is non-blocking and optimized.
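
A minimal sketch of the non-blocking pattern, written without Storm dependencies so it runs on its own. The class `NonBlockingSource`, its `offer` method, and the `Consumer<String>` emitter are hypothetical stand-ins for a real spout and its output collector. Storm's actual `nextTuple()` returns `void`; the boolean here only makes the behaviour visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

final class NonBlockingSource {
    private final Queue<String> buffer = new ConcurrentLinkedQueue<>();
    private final Consumer<String> emitter; // stand-in for the spout's collector

    NonBlockingSource(Consumer<String> emitter) {
        this.emitter = emitter;
    }

    /** Called by a fetcher thread to hand over a message. */
    void offer(String message) {
        buffer.add(message);
    }

    /** Emits at most one message and returns immediately, even when empty. */
    boolean nextTuple() {
        String message = buffer.poll(); // poll() returns null instead of waiting
        if (message == null) {
            return false;
        }
        emitter.accept(message);
        return true;
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        NonBlockingSource source = new NonBlockingSource(emitted::add);
        source.offer("a");
        source.nextTuple(); // emits "a"
        source.nextTuple(); // buffer is empty: returns at once
        System.out.println(emitted); // [a]
    }
}
```

The point of returning quickly is that a spout's ack and fail callbacks run on the same thread as nextTuple(), so a blocked nextTuple() also delays ack handling.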

I would also recommend profiling the code. Your Storm queues may be filling up, in which case you need to increase their sizes.

    config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE,32);
    config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE,16384);
    config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE,16384);
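
One detail worth checking when changing these values: in the Disruptor-based Storm releases these settings come from, the buffer sizes are expected to be powers of two. Treat that as an assumption to verify against the documentation for your Storm version. The helper class `BufferSizes` below is hypothetical and only validates candidate values.

```java
final class BufferSizes {
    /** True when n is a positive power of two (exactly one bit set). */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    public static void main(String[] args) {
        // The sizes used in the answer above.
        int[] sizes = {32, 16384, 16384};
        for (int size : sizes) {
            System.out.println(size + " -> " + isPowerOfTwo(size));
        }
    }
}
```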
