Understanding when to use stateful services and when to rely on external persistence in Azure Service Fabric


Question

I'm spending my evenings evaluating Azure Service Fabric as a replacement for our current WebApps/CloudServices stack, and I feel a little unsure about how to decide when services/actors with state should be stateful actors and when they should be stateless actors with externally persisted state (Azure SQL, Azure Storage, and DocumentDB). I know this is a fairly new product (to the general public at least), so there probably aren't many best practices for this yet, but I've read through most of the documentation made available by Microsoft without finding a definite answer.

The current problem domain I'm approaching is our event store. Parts of our applications are based on event sourcing and CQRS, and I'm evaluating how to move this event store over to the Service Fabric platform. The event store is going to contain a lot of time-series data, and as it's our only source of truth for the data persisted there, it must be consistent, replicated, and stored in some form of durable storage.

One way I have considered doing this is with a stateful "EventStream" actor: each instance of an aggregate using event sourcing stores its events within an isolated stream. This means the stateful actor could keep track of all the events for its own stream, and I'd have met my requirements for how the data is stored (transactional, replicated, and durable). However, some streams may grow very large (hundreds of thousands, if not millions, of events), and this is where I'm starting to get unsure. Having an actor with a large amount of state will, I imagine, affect the performance of the system when these large data models need to be serialized to or deserialized from disk.

Another option is to keep these actors stateless and have them just read their data from some external storage like Azure SQL, or just go with stateless services instead of actors.

Basically, when is the amount of state for an actor/service "too much", so that you should start considering other ways of handling state?

Also, this section in the "Service Fabric Actors design pattern: Some anti-patterns" documentation leaves me a little puzzled:

Treat Azure Service Fabric Actors as a transactional system. Azure Service Fabric Actors is not a two phase commit-based system offering ACID. If we do not implement the optional persistence, and the machine the actor is running on dies, its current state will go with it. The actor will be coming up on another node very fast, but unless we have implemented the backing persistence, the state will be gone. However, between leveraging retries, duplicate filtering, and/or idempotent design, you can achieve a high level of reliability and consistency.

What does "if we do not implement the optional persistence" indicate here? I was under the impression that as long as the transaction modifying the state succeeded, the data was persisted to durable storage and replicated to at least a subset of the replicas. This paragraph leaves me wondering whether there are situations where state within my actors/services will get lost, and whether this is something I need to handle myself. The impression I got from the stateful model in other parts of the documentation seems to contradict this statement.

Recommended answer

One option is to keep some of the state in the actor (say, what could be considered hot data that needs to be quickly available) and store everything else on a 'traditional' storage infrastructure such as SQL Azure, DocDB, and so on. It is difficult to give a general rule about how much local state is too much, but it may help to think in terms of hot vs. cold data. Reliable Actors also let you customize the StateProvider, so you can consider implementing your own (by implementing IActorStateProvider) with the specific policies you need to meet your requirements for amount of data, latency, reliability, and so on (note: documentation on the StateProvider interface is still very minimal, but we can publish some sample code if this is something you want to pursue).
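To make the hot vs. cold split concrete, here is a minimal, language-agnostic sketch in Python. It does not use the Service Fabric SDK: `EventStream`, `ColdStore`, `InMemoryColdStore`, and `hot_limit` are all hypothetical names invented for illustration, and the dict-backed store merely stands in for something like Azure SQL or DocumentDB. The idea is that the actor keeps only the most recent events locally and moves older ones to external storage.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class ColdStore(Protocol):
    """Hypothetical interface for an external store (Azure SQL, DocumentDB, ...)."""

    def append(self, stream_id: str, events: list[dict]) -> None: ...
    def read(self, stream_id: str, start: int, count: int) -> list[dict]: ...


class InMemoryColdStore:
    """Dict-backed fake so the sketch runs without any Azure dependency."""

    def __init__(self) -> None:
        self._streams: dict[str, list[dict]] = {}

    def append(self, stream_id: str, events: list[dict]) -> None:
        self._streams.setdefault(stream_id, []).extend(events)

    def read(self, stream_id: str, start: int, count: int) -> list[dict]:
        return self._streams.get(stream_id, [])[start:start + count]


@dataclass
class EventStream:
    """Toy stand-in for a stateful actor that owns one event stream."""

    stream_id: str
    cold: ColdStore
    hot_limit: int = 100                            # assumed tuning knob
    archived: int = 0                               # events already in cold storage
    hot: list[dict] = field(default_factory=list)   # recent events kept locally

    @property
    def version(self) -> int:
        return self.archived + len(self.hot)

    def append(self, event: dict) -> None:
        self.hot.append(event)
        overflow = len(self.hot) - self.hot_limit
        if overflow > 0:
            # Write to cold storage first, then trim the local state. The two
            # steps are not atomic here; a real system would have to make the
            # archive write safe to repeat after a crash.
            self.cold.append(self.stream_id, self.hot[:overflow])
            del self.hot[:overflow]
            self.archived += overflow

    def read(self, start: int, count: int) -> list[dict]:
        end = min(start + count, self.version)
        result: list[dict] = []
        if start < self.archived:
            # The older part of the range lives in the external store.
            cold_count = min(end, self.archived) - start
            result += self.cold.read(self.stream_id, start, cold_count)
        if end > self.archived:
            # The newer part is served from the actor's local state.
            hot_start = max(start, self.archived) - self.archived
            result += self.hot[hot_start:end - self.archived]
        return result
```

With `hot_limit=3`, appending five events leaves the last three in local state and the first two in the external store, while `read` still returns a contiguous range across both. Where to set the limit depends on your own latency and size measurements; the sketch only shows the shape of the split.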

About the anti-patterns: the note is more about implementing transactions across multiple actors. Reliable Actors provides a full guarantee on the reliability of the data within the boundaries of an actor. Because of the distributed and loosely coupled nature of the Actor model, implementing transactions that involve multiple actors is not a trivial task. If 'distributed' transactions are a strong requirement, the Reliable Services programming model is probably a better fit.
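The documentation quoted in the question mentions retries, duplicate filtering, and idempotent design as the way to get reliability across actors without a distributed transaction. A minimal sketch of that idea in plain Python follows. It is not the Reliable Actors API: `AccountActor`, `handle`, and `transfer` are hypothetical names, and the set of seen message IDs is assumed to be part of the actor's persisted state (and bounded in some way) in a real system.

```python
class AccountActor:
    """Toy actor: state changes only through handle(), one message at a time."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance
        # Assumption: in a real actor this set is stored with the actor state,
        # so duplicate filtering survives a failover.
        self._seen: set[str] = set()

    def handle(self, message_id: str, delta: int) -> bool:
        """Apply delta once per message_id; return False for a duplicate."""
        if message_id in self._seen:
            return False  # a retry of a message that was already applied
        self.balance += delta
        self._seen.add(message_id)
        return True


def transfer(source: AccountActor, target: AccountActor,
             transfer_id: str, amount: int) -> None:
    """Two idempotent steps with IDs derived from one transfer ID.

    There is no transaction spanning both actors. If the caller fails
    between the steps, it repeats the whole transfer with the same
    transfer_id, and steps that already succeeded are filtered out.
    """
    source.handle(f"{transfer_id}:debit", -amount)
    target.handle(f"{transfer_id}:credit", amount)
```

Calling `transfer(a, b, "t1", 30)` twice moves 30 units only once, which is what makes blind retries safe. The two balances can still be observed in an intermediate state between the steps, so this gives eventual consistency rather than ACID semantics.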
