GAE transaction failure and idempotency

Question

The Google App Engine documentation contains this paragraph:

Note: If your application receives an exception when committing a transaction, it does not always mean that the transaction failed. You can receive DatastoreTimeoutException, ConcurrentModificationException, or DatastoreFailureException exceptions in cases where transactions have been committed and eventually will be applied successfully. Whenever possible, make your Datastore transactions idempotent so that if you repeat a transaction, the end result will be the same.

Wait, what? It seems like there's a very important class of transactions that just simply cannot be made idempotent because they depend on current datastore state. For example, a simple counter, as in a like button. The transaction needs to read the current count, increment it, and write out the count again. If the transaction appears to "fail" but doesn't REALLY fail, and there's no way for me to tell that on the client side, then I need to try again, which will result in one click generating two "likes." Surely there is some way to prevent this with GAE?
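To make the failure mode concrete, here is a minimal sketch of the double-count scenario. It assumes an in-memory stand-in for the datastore: `FlakyStore`, `CommitUnknown`, and `like_naive` are hypothetical names for illustration, not part of the App Engine API.

```python
class CommitUnknown(Exception):
    """Hypothetical stand-in for a commit-time exception whose outcome is unknown."""


class FlakyStore:
    """In-memory stand-in for a datastore; not the App Engine API."""

    def __init__(self, fail_after_commit=1):
        self.data = {"likes": 0}
        # Number of commits that are applied but still raise to the caller.
        self.fail_after_commit = fail_after_commit

    def commit(self, key, value):
        self.data[key] = value  # the write is applied...
        if self.fail_after_commit > 0:
            self.fail_after_commit -= 1
            # ...but the caller only sees an error.
            raise CommitUnknown("commit outcome unknown")


def like_naive(store, retries=3):
    """Read-increment-write, retried blindly on any commit exception."""
    for _ in range(retries):
        try:
            store.commit("likes", store.data["likes"] + 1)
            return
        except CommitUnknown:
            continue  # cannot tell whether the write landed, so try again
    raise RuntimeError("gave up after repeated commit errors")


store = FlakyStore()
like_naive(store)
print(store.data["likes"])  # one click, but the counter now reads 2
```

The first commit is applied and then reported as an error, so the blind retry increments a second time.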

It seems that this is a problem inherent in distributed systems, according to none other than Guido van Rossum; see this link:

App Engine datastore transaction exception

So it looks like designing idempotent transactions is pretty much a must if you want a high degree of reliability.

I was wondering if it was possible to implement a global system across a whole app for ensuring idempotency. The key would be to maintain a transaction log in the datastore. The client would generate a GUID and include it with the request (the same GUID would be re-sent on retries of the same request). On the server, at the start of each transaction, it would look in the datastore for a record in the Transactions entity group with that ID. If it found one, then this is a repeated transaction, so it would return without doing anything.

Of course this would require enabling cross-group transactions, or having a separate transaction log as a child of each entity group. Also there would be a performance hit if failed entity key lookups are slow, because almost every transaction would include a failed lookup, because most GUIDs would be new.
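The per-entity-group variant of this idea could be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not App Engine code: `EntityGroup`, `CommitUnknown`, and `like_once` are hypothetical names, and the `applied` set stands in for log records stored as children of the entity group so that the counter and its log entry commit atomically.

```python
import uuid


class CommitUnknown(Exception):
    """Hypothetical stand-in for a commit-time exception whose outcome is unknown."""


class EntityGroup:
    """In-memory stand-in for one entity group; not the App Engine API."""

    def __init__(self, fail_after_commit=0):
        self.count = 0
        self.applied = set()  # log of request IDs already applied to this group
        self.fail_after_commit = fail_after_commit

    def commit(self, count, request_id):
        # The counter and the log entry change together, as they would
        # inside a single entity-group transaction.
        self.count = count
        self.applied.add(request_id)
        if self.fail_after_commit > 0:
            self.fail_after_commit -= 1
            raise CommitUnknown("commit outcome unknown")


def like_once(group, request_id, retries=3):
    """Apply one like per request ID, however often the request is retried."""
    for _ in range(retries):
        if request_id in group.applied:
            return  # repeated transaction: return without doing anything
        try:
            group.commit(group.count + 1, request_id)
            return
        except CommitUnknown:
            continue  # the next pass checks the log before writing again
    raise RuntimeError("gave up after repeated commit errors")


group = EntityGroup(fail_after_commit=1)
request_id = str(uuid.uuid4())  # generated by the client, re-sent on retries
like_once(group, request_id)    # first commit lands but reports an error
like_once(group, request_id)    # a client-side retry of the same click is a no-op
print(group.count)  # 1
```

Because the log lookup happens before every write, both the server-side retry and the client-side resend find the GUID already recorded and leave the counter alone.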

As for the additional $ cost of the extra datastore interactions, this would probably still be less than if I had to make every transaction idempotent, since that would require a lot of checking of what's in the datastore at each level.

Recommended answer

dan wilkerson, simon goldsmith, et al. designed a thorough global transaction system on top of app engine's local (per entity group) transactions. at a high level, it uses techniques similar to the GUID one you describe. dan dealt with "submarine writes," ie the transactions you describe that report failure but later surface as succeeded, as well as many other theoretical and practical details of the datastore. erick armbrust implemented dan's design in tapioca-orm.

i don't necessarily recommend that you implement his design or use tapioca-orm, but you'd definitely be interested in the research.

in response to your questions: plenty of people implement GAE apps that use the datastore without idempotency. it's only important when you need transactions with certain kinds of guarantees like the ones you describe. it's definitely important to understand when you do need them, but you often don't.

the datastore is implemented on top of megastore, which is described in depth in this paper. in short, it uses multi-version concurrency control within each entity group and Paxos for replication across datacenters, both of which can contribute to submarine writes. i don't know if there are public numbers on submarine write frequency in the datastore, but if there are, searches with these terms and on the datastore mailing lists should find them.

amazon's S3 isn't really a comparable system; it's more of a CDN than a distributed database. amazon's SimpleDB is comparable. it originally only provided eventual consistency, and eventually added a very limited kind of transactions they call conditional writes, but it doesn't have true transactions. other NoSQL databases (redis, mongo, couchdb, etc.) have different variations on transactions and consistency.

basically, there's always a tradeoff in distributed databases between scale, transaction breadth, and strength of consistency guarantees. this is best known from eric brewer's CAP theorem, which says the three axes of the tradeoff are consistency, availability, and partition tolerance.
