Route events to eventhub EventProcessor


Question

I have events of different types. For example, some data is telemetry data, some is error information, and so on.

I thought it would be a good idea to create several IEventProcessor implementations, one for each event type, so that each implementation handles its events differently, for example by writing to a file or to a database.

What's the best way to route events to a specific EventProcessor?

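  • Should I have an EventProcessor watch for a specific partition key?
  • Should I use the EventProcessorHost constructor to specify a consumer group name? If so, how do I send to a specific consumer group using the EventHubClient? I can't see an option to specify a consumer group there.
  • Should I do none of the above and simply check the incoming event data for a specific property, ignoring the events I'm not interested in?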

I must say that I find the relation between PartitionKey and consumer group (if there is any) badly documented.

I've used option 2, but so far each EventProcessor gets messages from all the consumer group names, not just the one specified in the EventProcessorHost constructor.

Recommended answer

Good question!

Before answering, I want to reiterate a couple of principles we followed while building Event Hubs.

  • We wanted Event Hubs to be a highly durable, high-throughput event ingestion pipeline. Azure already had pub-sub services such as Queues/Topics (similar to AWS SQS, Google Pub-sub), so the main differentiator for a new service was to offer a higher-throughput variant (with low latency, of course). We delivered on that goal with one trade-off: the service performs no per-message computation, such as executing a filter. When you need per-message semantics (de-duplication per message, acknowledging receipt per message or, in your case, filtering on a per-message property) and your throughput requirements are low, a Queue/Topic might be your best bet.

We also envisioned that senders (or publishers) operate at a much higher scale and vary significantly by scenario, so we introduced three sending patterns: Send, Send with PartitionKey, and Send directly to a partition. This is why you see the notion of a PartitionKey when sending: it translates to a particular partition. Think of the PartitionKey as a hint that the Event Hubs service uses to place all events with the same PartitionKey on the same partition. When consuming events, however, Event Hubs does not directly expose any notion of PartitionKey. There is no relation between ConsumerGroups and PartitionKey.
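The placement rule described above can be illustrated with a small Python sketch. It is only a model: the hash used here (SHA-256 modulo the partition count) and the names `partition_for` and `place_events` are assumptions made for illustration, not the algorithm or API that Event Hubs actually uses. The point it shows is that events sharing a key always land on the same partition, and that nothing in the placement refers to a consumer group.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index.

    Illustrative hash only; the real service uses its own internal hashing.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def place_events(events, partition_count):
    """Append each (key, body) event to the partition chosen by its key."""
    partitions = [[] for _ in range(partition_count)]
    for key, body in events:
        partitions[partition_for(key, partition_count)].append((key, body))
    return partitions


# Hypothetical sample events: two devices, one of them sending twice.
events = [("device-1", "t=20"), ("device-2", "t=22"), ("device-1", "t=21")]
partitions = place_events(events, partition_count=4)

# Both "device-1" events sit on the same partition, in send order.
same_partition = partitions[partition_for("device-1", 4)]
print([body for key, body in same_partition if key == "device-1"])
```

A consumer never asks for a key; it reads a partition and sees whatever keys happened to hash there.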

Receivers, in contrast, are usually just compute roles and are limited in number, so we exposed one generic receive (consume) pattern: receive from a partition. Consumers can still differ in several ways, for example in speed of consumption (real-time vs. historical) or in the type of data they care about, and that is why we exposed multiple consumer groups. Although you can create 20 consumer groups, there is an interesting limitation: each throughput unit purchased yields 1 MB/s in and 2 MB/s out, so if the send side is fully utilized, you are limited to 2 consumer groups. So if you are processing the exact same stream and have different ways to handle each event, but each of them takes an equal amount of time to process, then using the same ConsumerGroup makes more sense.
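The arithmetic behind the "2 consumer groups" figure can be written out as a short Python sketch. It assumes that every consumer group reads the whole stream at the same rate it is being sent, and it uses only the numbers quoted above (1 MB/s in and 2 MB/s out per throughput unit, at most 20 consumer groups). The function name `max_full_rate_consumer_groups` is hypothetical.

```python
INGRESS_MB_PER_TU = 1      # MB/s in per throughput unit, as quoted above
EGRESS_MB_PER_TU = 2       # MB/s out per throughput unit, as quoted above
MAX_CONSUMER_GROUPS = 20   # consumer groups that can be created, as quoted above


def max_full_rate_consumer_groups(throughput_units: int, ingress_mb_per_s: float) -> int:
    """Number of consumer groups that can each read the full stream.

    Assumption: every consumer group reads at the ingress rate, so the
    egress budget is split evenly between them.
    """
    if ingress_mb_per_s <= 0:
        raise ValueError("ingress rate must be positive")
    if ingress_mb_per_s > throughput_units * INGRESS_MB_PER_TU:
        raise ValueError("ingress rate exceeds the purchased throughput units")
    egress_budget = throughput_units * EGRESS_MB_PER_TU
    return min(int(egress_budget // ingress_mb_per_s), MAX_CONSUMER_GROUPS)


# One throughput unit, send side fully utilized at 1 MB/s.
print(max_full_rate_consumer_groups(1, 1.0))  # 2
```

Buying more throughput units does not change the ratio when the send side stays saturated: four units at 4 MB/s in still leave room for only 2 full-rate consumer groups, whereas four units at 2 MB/s in leave room for 4.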

To answer your question: it is really up to you.

Here are a few possible solutions:

  • Since your scenario has a mix of event types, you need to foresee whether any scenario requires a single consumer/processor to read and process all types of events. One example we often see: one ConsumerGroup keeps a count of all errors, while another consumer group performs a specific action per error type. If you don't need that, sending each event type to a different event hub and then using one consumer group with the specific IEventProcessor is an option.
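As a rough sketch of the "one event hub per event type" option, the sender can pick the target hub from the event type before sending. The hub names, the `type` field, and the helpers `hub_for` and `group_by_hub` are all hypothetical; the actual send call is left out because it depends on the client library in use.

```python
# Hypothetical mapping from event type to the event hub that receives it.
HUB_BY_EVENT_TYPE = {
    "telemetry": "telemetry-hub",
    "error": "error-hub",
}


def hub_for(event: dict) -> str:
    """Return the name of the event hub configured for this event's type."""
    event_type = event["type"]
    try:
        return HUB_BY_EVENT_TYPE[event_type]
    except KeyError:
        raise ValueError(f"no event hub configured for event type {event_type!r}") from None


def group_by_hub(events):
    """Group events into per-hub batches, ready to hand to a sender."""
    batches = {}
    for event in events:
        batches.setdefault(hub_for(event), []).append(event)
    return batches


batches = group_by_hub([
    {"type": "telemetry", "body": "t=20"},
    {"type": "error", "body": "disk full"},
    {"type": "telemetry", "body": "t=21"},
])
```

With this layout each hub carries a single event type, so its processor needs no filtering logic at all.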

If you have scenarios that require sending all events to the same EventHub, and you know that some event types are (or need to be) processed very quickly, consider using different consumer groups, each tied to a specific IEventProcessor implementation that ignores the other event types. For example, if ErrorInfo events and Special events need attention in real time, while telemetry data can tolerate a delay of 15 minutes due to slow processing or peak load, I would create one ConsumerGroup named Real-time and tie it to an IEventProcessor that handles two types, Error and Special. Then create a second ConsumerGroup and tie it to an IEventProcessor that handles Telemetry events.
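The consumer-group layout from this example can be modelled in a few lines of Python. This is a sketch of the filtering logic only, not of the IEventProcessor interface: the class `TypeFilteringProcessor`, the `type` field on events, and the consumer group names are assumptions for illustration. Each consumer group is handed the complete stream, and its processor keeps only the event types it is responsible for.

```python
class TypeFilteringProcessor:
    """Stand-in for a processor that handles some event types and ignores the rest."""

    def __init__(self, handled_types):
        self.handled_types = set(handled_types)
        self.handled = []

    def process_events(self, events):
        for event in events:
            if event.get("type") in self.handled_types:
                self.handled.append(event)  # real handling would go here
            # events of any other type are skipped on purpose


# Hypothetical stream: every consumer group sees all of it.
stream = [
    {"type": "telemetry", "body": "t=20"},
    {"type": "error", "body": "disk full"},
    {"type": "special", "body": "firmware update"},
    {"type": "telemetry", "body": "t=21"},
]

processors_by_consumer_group = {
    "real-time": TypeFilteringProcessor({"error", "special"}),
    "telemetry": TypeFilteringProcessor({"telemetry"}),
}

for processor in processors_by_consumer_group.values():
    processor.process_events(stream)
```

Because the filtering happens in the processor rather than in the service, each consumer group still pays the read cost for the full stream, which is where the throughput limit discussed earlier comes into play.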
