Confusion of Storm acker and guaranteed message processing

Views: 78 Published: 2020/9/4 22:53:12 apache-storm

Problem description

I am learning about Storm's guaranteed message processing and am confused by some of the concepts in this part.

To guarantee that a message emitted by a spout is fully processed, Storm uses an acker. Each time a spout emits a tuple, the acker assigns an "ack val", initialized to 0, to store the status of the tuple tree. Each time a downstream bolt of this tuple emits a new tuple or acks an "old" tuple, the tuple ID is XORed into the "ack val". The acker only needs to check whether the "ack val" is 0 to know that the tuple has been fully processed. Consider the code below:

public class WordReader implements IRichSpout {
    ... ...
    while ((str = reader.readLine()) != null) {
        this.collector.emit(new Values(str), str);
        ... ...
    }
}

The WordReader snippet above is a spout from the word count program in the "Getting Started with Storm" tutorial. In the emit method, the second parameter, "str", is the messageId. I am confused by this parameter:

1) As I understand it, each time a tuple (i.e., a message) is emitted, whether in a spout or in a bolt, it should be Storm's responsibility to assign a 64-bit messageId to that message. Is that correct? Or is "str" here just a human-readable alias for the message?

2) Whatever the answer to 1), "str" here would be the same word in two different messages, because a text file is likely to contain many duplicate words. If so, how does Storm differentiate between messages, and what does this parameter mean?

3) In some code, I see spouts use the following to set the message ID in the spout's emit method:

public class RandomIntegerSpout extends BaseRichSpout {
    private long msgId = 0;
    collector.emit(new Values(..., ++msgId), msgId);
}

This is much closer to what I expected: the message ID should be different for every message. But this snippet raises another question: what happens to the private field "msgId" across different executors? Since each executor has its own msgId initialized to 0, messages in different executors will all be numbered 1, 2, 3, and so on. How does Storm differentiate these messages?

I am new to Storm, so these questions may be naive. I hope someone can help me figure this out. Thanks!
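
For reference, the XOR bookkeeping described in the question can be sketched in plain Java, without any Storm dependency. This is only an illustration of the idea as stated above, not Storm's actual acker code: the class AckValSketch, its methods, and the small fixed tuple IDs are all hypothetical and chosen for readability.

```java
public class AckValSketch {
    // Hypothetical stand-in for the "ack val" kept for one tuple tree.
    private long ackVal = 0L;

    // XOR a tuple ID into the ack val. The same call models both events:
    // a tuple being emitted and, later, that same tuple being acked.
    public void xor(long tupleId) {
        ackVal ^= tupleId;
    }

    // Every ID that was XORed in twice (emit + ack) cancels out,
    // so 0 means no tuple in the tree is still pending.
    public boolean isFullyProcessed() {
        return ackVal == 0L;
    }

    public long value() {
        return ackVal;
    }

    public static void main(String[] args) {
        // Assumption: tiny fixed IDs instead of 64-bit values, for readability.
        long spoutTuple = 0x1L;
        long childA = 0x2L;
        long childB = 0x4L;

        AckValSketch tree = new AckValSketch();
        tree.xor(spoutTuple);   // spout emits the root tuple      -> ack val 1
        tree.xor(childA);       // bolt emits a child tuple        -> ack val 3
        tree.xor(childB);       // bolt emits a second child tuple -> ack val 7
        tree.xor(spoutTuple);   // bolt acks the root tuple        -> ack val 6
        System.out.println("pending after root ack: " + !tree.isFullyProcessed());

        tree.xor(childA);       // downstream bolt acks child A    -> ack val 4
        tree.xor(childB);       // downstream bolt acks child B    -> ack val 0
        System.out.println("fully processed: " + tree.isFullyProcessed());
    }
}
```

Because each ID enters the value once on emit and once on ack, the order of the events does not matter; the value returns to 0 only when every emitted tuple has also been acked. The sketch also shows why the IDs fed into the XOR must be distinct within a tree: two pending tuples with the same ID would cancel each other out prematurely.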
