Azure Event Hubs and multiple consumer groups


Problem description

I need help using Azure Event Hubs in the following scenario. I think consumer groups might be the right option, but I have not been able to find a concrete example online.

Here is a rough description of the problem and my proposed solution using Event Hubs. I am not sure this is the optimal solution and would appreciate your feedback.

I have multiple event sources that generate a lot of event data (telemetry from sensors). The data needs to be saved to our database, and some analysis, such as running average and min/max, should be performed in parallel.

The sender can only send data to a single endpoint, but the event hub should make this data available to both data handlers.

I am thinking about using two consumer groups. The first will be a cluster of worker role instances that save the data to our key-value store; the second will be an analysis engine (most likely Azure Stream Analytics).

First, how do I set up the consumer groups, and is there anything I need to do on the sender or receiver side so that copies of events appear in all consumer groups?

I have read many examples online, but they either use client.GetDefaultConsumerGroup(); and/or have all partitions processed by multiple instances of the same worker role.

In my scenario, when an event is triggered, it needs to be processed by two different worker roles in parallel (one that saves the data and one that does some analysis).

Thank you!

Recommended answer

TL;DR: Looks reasonable. Just create two consumer groups with different names using CreateConsumerGroupIfNotExists.

Consumer groups are primarily a concept, so exactly how they work depends on how your subscribers are implemented. As you know, conceptually each one is a group of subscribers working together so that the group as a whole receives all the messages and, under ideal circumstances (which won't happen in practice), consumes each message exactly once. This means that each consumer group will "have all partitions processed by multiple instances of the same worker role." You want this.

This can be implemented in different ways. Microsoft provides two ways to consume messages from Event Hubs directly, plus the option of using things like Stream Analytics, which are probably built on top of the two direct ways. The first is the Event Hub Receiver (http://msdn.microsoft.com/en-us/library/azure/microsoft.servicebus.messaging.eventhubreceiver.aspx); the second, which is higher level, is the Event Processor Host (http://msdn.microsoft.com/en-us/library/azure/microsoft.servicebus.messaging.eventprocessorhost.aspx).

I have not used the Event Hub Receiver directly, so this particular comment is based on the theory of how these sorts of systems work and on speculation from the documentation. Although receivers are created from EventHubConsumerGroups (http://msdn.microsoft.com/en-us/library/azure/microsoft.servicebus.messaging.eventhubconsumergroup.aspx), this serves little purpose because the receivers do not coordinate with one another. If you use them you will need to (and can!) do all the coordination and offset committing yourself, which has advantages in some scenarios, such as writing the offset to a transactional DB in the same transaction as the computed aggregates. With these low-level receivers, having different logical consumer groups share the same Azure consumer group probably shouldn't be particularly problematic (that is a statement about how it ought to behave, not practical advice), but you should use different names in case it does matter or you later switch to EventProcessorHosts.
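To make the "do the coordination and offsets yourself" point concrete, here is a minimal sketch of reading one partition through a named consumer group with the low-level receiver. It assumes the Microsoft.ServiceBus.Messaging SDK discussed in this answer; the class name, the Handle method, and the idea of returning the last offset so the caller can persist it are illustrative assumptions, not something taken from the author's code.

```csharp
using System;
using System.Text;
using Microsoft.ServiceBus.Messaging;

class ManualReceiverSketch
{
    // Hypothetical handler; replace with your own processing.
    static void Handle(string body)
    {
        Console.WriteLine(body);
    }

    // Reads one partition via the given consumer group until no event arrives
    // within the wait time. Returns the offset of the last event handled so the
    // caller can store it (for example in the same DB transaction as its aggregates).
    // startingOffset: an offset the caller persisted earlier, or null to let the
    // SDK use its default starting position.
    static string ReadPartition(string connectionString, string eventHubPath,
        string consumerGroupName, string partitionId, string startingOffset)
    {
        EventHubClient client =
            EventHubClient.CreateFromConnectionString(connectionString, eventHubPath);
        EventHubConsumerGroup group = client.GetConsumerGroup(consumerGroupName);
        EventHubReceiver receiver = startingOffset == null
            ? group.CreateReceiver(partitionId)
            : group.CreateReceiver(partitionId, startingOffset);

        string lastOffset = startingOffset;
        EventData data;
        // Receive returns null when nothing arrives within the wait time.
        while ((data = receiver.Receive(TimeSpan.FromSeconds(5))) != null)
        {
            Handle(Encoding.UTF8.GetString(data.GetBytes()));
            lastOffset = data.Offset;
        }

        receiver.Close();
        client.Close();
        return lastOffset;
    }
}
```

Nothing here stops two processes from reading the same partition at the same time, which is exactly the coordination the EventProcessorHost adds.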

Now on to more useful information. EventProcessorHosts are probably built on top of EventHubReceivers. They are a higher-level abstraction, with support for multiple machines working together as one logical consumer group. Below is a lightly edited snippet from my code that creates an EventProcessorHost, with a number of comments left in to explain some choices.

// Namespaces used: Microsoft.ServiceBus (NamespaceManager) and
// Microsoft.ServiceBus.Messaging (EventHubDescription, EventProcessorHost).

// We need an identifier for the lease. It must be unique across concurrently
// running instances of the program. There are three main options for this: a
// static value from a config file, the machine's NetBIOS name (i.e.
// System.Environment.MachineName), or a random value unique per run, which is
// what we have chosen here. If our VMs have very weak randomness, bad things may happen.
string hostName = Guid.NewGuid().ToString();

// It's not clear if we want this here long term or if we prefer that the consumer
// groups be created out of band. Nor are there necessarily good tools to discover
// existing consumer groups.
NamespaceManager namespaceManager =
    NamespaceManager.CreateFromConnectionString(eventHubConnectionString);
EventHubDescription ehd = namespaceManager.GetEventHub(eventHubPath);
namespaceManager.CreateConsumerGroupIfNotExists(ehd.Path, consumerGroupName);

host = new EventProcessorHost(hostName, eventHubPath, consumerGroupName,
    eventHubConnectionString, storageConnectionString, leaseContainerName);
// Call something like this (from an async method) when you want it to start.
await host.RegisterEventProcessorFactoryAsync(factory);
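The snippet above registers a factory without showing it. As a rough sketch of what such a factory and its processor might look like with the same SDK: the class names StorageProcessor and StorageProcessorFactory and the checkpoint-after-every-batch policy are assumptions made for illustration, not the author's actual code.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.ServiceBus.Messaging;

// One processor instance handles one partition at a time.
class StorageProcessor : IEventProcessor
{
    public Task OpenAsync(PartitionContext context)
    {
        // Called when this host acquires the lease for a partition.
        return Task.FromResult(0);
    }

    public async Task ProcessEventsAsync(PartitionContext context,
        IEnumerable<EventData> messages)
    {
        foreach (EventData message in messages)
        {
            string body = Encoding.UTF8.GetString(message.GetBytes());
            // Save `body` to the key-value store here (omitted in this sketch).
        }
        // Record progress for this partition. Checkpointing after every batch is
        // the simplest policy; a real system may checkpoint less often.
        await context.CheckpointAsync();
    }

    public async Task CloseAsync(PartitionContext context, CloseReason reason)
    {
        // On a clean shutdown, save progress; on a lost lease another host owns it now.
        if (reason == CloseReason.Shutdown)
        {
            await context.CheckpointAsync();
        }
    }
}

class StorageProcessorFactory : IEventProcessorFactory
{
    public IEventProcessor CreateEventProcessor(PartitionContext context)
    {
        return new StorageProcessor();
    }
}
```

With these in place, `factory` in the snippet would be `new StorageProcessorFactory()`. The analysis worker role would use its own processor class and its own consumer group name.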

You'll notice that I told Azure to create the consumer group if it doesn't exist; without that call you'll get a lovely error message when the group is missing. I honestly don't know what the purpose of this is, because the consumer group doesn't include the storage connection string, which needs to be the same across instances for the EventProcessorHost's coordination (and presumably commits) to work properly.

Here I've provided a picture from Azure Storage Explorer of the leases, and presumably offsets, from a consumer group I was experimenting with in November. Note that while I have a testhub and a testhub-testcg container, this is due to manually naming them. If they were in the same container, the blobs would be named things like "$Default/0" vs "testcg/0".

As you can see, there is one blob per partition. My assumption is that these blobs are used for two things. The first is blob leases, which distribute partitions amongst instances; the second is storing the offsets within the partition that have been committed.

Rather than the data getting pushed to the consumer groups, the consuming instances ask the storage system for data at some offset in one partition. EventProcessorHosts are a nice high-level way of having a logical consumer group in which each partition is read by only one consumer at a time, and in which the progress the logical consumer group has made in each partition is not forgotten.
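The pull model can be illustrated with a toy in-memory model. This is not the Event Hubs API: ToyPartitionedLog and its methods are made up for illustration. It shows only that the sender writes each event once, and that each consumer group keeps its own position per partition, so every group sees every event.

```csharp
using System;
using System.Collections.Generic;

// Toy model: an append-only log per partition plus a read position per
// (consumer group, partition). Names here are hypothetical.
class ToyPartitionedLog
{
    private readonly List<string>[] partitions;
    // committed[group][partition] = index of the next event that group will read
    private readonly Dictionary<string, int[]> committed =
        new Dictionary<string, int[]>();

    public ToyPartitionedLog(int partitionCount)
    {
        partitions = new List<string>[partitionCount];
        for (int i = 0; i < partitionCount; i++) partitions[i] = new List<string>();
    }

    // The sender writes once; it knows nothing about consumer groups.
    public void Send(int partition, string body)
    {
        partitions[partition].Add(body);
    }

    // A consumer asks for data starting at its group's position and advances it.
    public List<string> Pull(string group, int partition, int maxCount)
    {
        int[] offsets;
        if (!committed.TryGetValue(group, out offsets))
        {
            offsets = new int[partitions.Length];
            committed[group] = offsets;
        }
        List<string> log = partitions[partition];
        int start = offsets[partition];
        int count = Math.Min(maxCount, log.Count - start);
        List<string> batch = log.GetRange(start, count);
        offsets[partition] = start + count;
        return batch;
    }
}

class Demo
{
    static void Main()
    {
        var hub = new ToyPartitionedLog(2);
        hub.Send(0, "t=21.5");
        hub.Send(0, "t=21.7");
        hub.Send(1, "t=19.0");

        Console.WriteLine(string.Join(",", hub.Pull("storage", 0, 10)));   // t=21.5,t=21.7
        Console.WriteLine(string.Join(",", hub.Pull("analytics", 0, 1)));  // t=21.5
        Console.WriteLine(string.Join(",", hub.Pull("analytics", 0, 10))); // t=21.7
        Console.WriteLine(hub.Pull("storage", 0, 10).Count);               // 0
    }
}
```

The "storage" and "analytics" groups read the same events from partition 0 at their own pace, and neither affects the other's position.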

Remember that throughput is metered per partition, so if you're maxing out ingress you can only have two logical consumers that are both up to speed (egress capacity is twice ingress capacity). As such, you'll want to make sure you have enough partitions, and throughput units, that you can:


  1. Read all the data that is sent.

  2. Catch up within the 24-hour retention period if you fall behind by a few hours due to problems.
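The second point is simple arithmetic, sketched below. The rates are made-up example numbers, and the helper is hypothetical: a consumer that is behind only catches up if it can read faster than new data arrives, and the time it needs is the backlog divided by that surplus.

```csharp
using System;

static class CatchUpEstimate
{
    // Hours needed to drain a backlog while new data keeps arriving.
    // Returns positive infinity if the consumer cannot read faster than
    // data arrives, in which case the lag never shrinks.
    public static double HoursToCatchUp(double ingressMBps, double readMBps,
        double hoursBehind)
    {
        double backlogMB = ingressMBps * 3600 * hoursBehind;
        double surplusMBps = readMBps - ingressMBps;
        if (surplusMBps <= 0) return double.PositiveInfinity;
        return backlogMB / surplusMBps / 3600;
    }

    static void Main()
    {
        // Assumed example: 0.5 MB/s arriving, consumer reads 1 MB/s, 3 hours behind.
        Console.WriteLine(HoursToCatchUp(0.5, 1.0, 3.0)); // 3
        // If the consumer can only just keep up, it never recovers.
        Console.WriteLine(double.IsPositiveInfinity(HoursToCatchUp(1.0, 1.0, 3.0))); // True
    }
}
```

If the surplus is zero or negative, the lag stays constant or grows until it reaches the retention period, at which point unread events are lost. That is why spare read capacity per consumer group matters.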

In conclusion: consumer groups are what you need. The examples you read that use a specific consumer group are good. Within each logical consumer group, use the same name for the Azure consumer group, and have different logical consumer groups use different names.
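As a sketch of that naming rule, each worker role could build its host through a small helper like the one below. The helper, the example group names "storage" and "analytics", and the lease container naming scheme are assumptions for illustration; the constructor is the same six-argument EventProcessorHost constructor used in the snippet earlier in this answer.

```csharp
using Microsoft.ServiceBus.Messaging;

static class HostFactory
{
    // Every instance of one worker role passes the same roleConsumerGroup
    // (for example "storage"); the other role passes a different one
    // (for example "analytics"). Both names are hypothetical.
    public static EventProcessorHost Create(string hostName, string eventHubPath,
        string roleConsumerGroup, string eventHubConnectionString,
        string storageConnectionString)
    {
        // One lease container per consumer group keeps their leases and offsets
        // apart. The combined name must still be a valid blob container name
        // (lowercase letters, digits, and hyphens).
        string leaseContainerName = eventHubPath + "-" + roleConsumerGroup;
        return new EventProcessorHost(hostName, eventHubPath, roleConsumerGroup,
            eventHubConnectionString, storageConnectionString, leaseContainerName);
    }
}
```

The sender needs no changes for this: it keeps writing to the one event hub, and each role reads through its own consumer group.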

I haven't yet used Azure Stream Analytics, but at least during the preview release you are limited to the default consumer group. So don't use the default consumer group for anything else, and if you need two separate Azure Stream Analytics jobs you may need to do something nasty. But it's easy to configure!
