Reliability of atomic counters in DynamoDB


Question

I am considering using Amazon DynamoDB in my application, and I have a question about the reliability of its atomic counters.

I'm building a distributed application that needs to concurrently and consistently increment/decrement a counter stored in a DynamoDB attribute. I was wondering how reliable DynamoDB's atomic counter is in a heavily concurrent environment, where the concurrency level is extremely high (say, an average rate of 20k concurrent hits per second; to give an idea, that would be almost 52 billion increments/decrements per month).

The counter must be extremely reliable and never miss a hit. Has anybody tested DynamoDB in such a critical environment?

Thanks

Recommended answer

DynamoDB gets its scaling properties by splitting the keys across multiple servers. This is similar to how other distributed databases, such as Cassandra and HBase, scale. You can increase the provisioned throughput on DynamoDB, but that only spreads your data over more servers, so that each server handles roughly (total concurrent connections / number of servers). Take a look at the FAQ for an explanation of how to achieve maximum throughput:


Q: Will I always be able to achieve my level of provisioned throughput?

Amazon DynamoDB assumes a relatively random access pattern across all primary keys. You should set up your data model so that your requests result in a fairly even distribution of traffic across primary keys. If you have a highly uneven or skewed access pattern, you may not be able to achieve your level of provisioned throughput.

When storing data, Amazon DynamoDB divides a table into multiple partitions and distributes the data based on the hash key element of the primary key. The provisioned throughput associated with a table is also divided among the partitions; each partition's throughput is managed independently based on the quota allotted to it. There is no sharing of provisioned throughput across partitions. Consequently, a table in Amazon DynamoDB is best able to meet the provisioned throughput levels if the workload is spread fairly uniformly across the hash key values. Distributing requests across hash key values distributes the requests across partitions, which helps achieve your full provisioned throughput level.

If you have an uneven workload pattern across primary keys and are unable to achieve your provisioned throughput level, you may be able to meet your throughput needs by increasing your provisioned throughput level further, which will give more throughput to each partition. However, it is recommended that you consider modifying your request pattern or your data model in order to achieve a relatively random access pattern across primary keys.
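To make the FAQ's point concrete, here is a minimal Python sketch of how an even split of provisioned throughput interacts with a hot key. It is only an illustration under stated assumptions: the MD5-based `partition_for` function, the partition count of 10, and the 20,000 writes/second figure are all hypothetical, and DynamoDB's real hash function and partition layout are internal to the service.

```python
import hashlib


def partition_for(hash_key: str, num_partitions: int) -> int:
    """Map a hash key to a partition index.

    Assumption: MD5 is used here only for illustration; the hash function
    DynamoDB actually uses is not public.
    """
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def per_partition_quota(provisioned: float, num_partitions: int) -> float:
    """Provisioned throughput is divided among partitions, with no sharing."""
    return provisioned / num_partitions


def load_by_partition(keys, num_partitions: int) -> list:
    """Count how many requests land on each partition."""
    load = [0] * num_partitions
    for key in keys:
        load[partition_for(key, num_partitions)] += 1
    return load


# Hypothetical numbers: 20,000 writes/second provisioned over 10 partitions.
PROVISIONED_WRITES = 20_000
NUM_PARTITIONS = 10
quota = per_partition_quota(PROVISIONED_WRITES, NUM_PARTITIONS)

# Every write targets the same key, so all of them hit a single partition.
hot = load_by_partition(["counter"] * PROVISIONED_WRITES, NUM_PARTITIONS)

# The same number of writes spread over 1,000 distinct keys.
spread = load_by_partition(
    [f"counter#{i % 1000}" for i in range(PROVISIONED_WRITES)], NUM_PARTITIONS
)

print("per-partition quota:", quota)
print("busiest partition, single key:", max(hot))
print("busiest partition, 1,000 keys:", max(spread))
```

In the single-key case, one partition receives all 20,000 requests while its share of the provisioned throughput is only 2,000, which is the skew the FAQ warns about.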

This means that a single key that is incremented directly will not scale, since that key must live on one server. There are other ways to handle this problem: for example, in-memory aggregation with a periodic flush of the increments to DynamoDB (though this can have reliability issues), or a sharded counter, where the increments are spread over multiple keys and the total is read back by fetching all keys in the sharded counter (http://whynosql.com/scaling-distributed-counters/).
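The sharded-counter idea can be sketched in a few lines of Python. This is an illustration, not the implementation from the linked article: `InMemoryTable` is a hypothetical stand-in for a DynamoDB table so the example runs without AWS access, and the `name#shard` key format and the shard count of 10 are assumptions. Against a real table, the `add` method would roughly correspond to an `UpdateItem` call with an `ADD` update expression on the shard's item.

```python
import random
from collections import defaultdict


class InMemoryTable:
    """Hypothetical stand-in for a DynamoDB table: one number per key."""

    def __init__(self):
        self._items = defaultdict(int)

    def add(self, key: str, delta: int) -> None:
        # Assumption: on DynamoDB this would be an atomic UpdateItem
        # using an ADD update expression on the item with this key.
        self._items[key] += delta

    def get(self, key: str) -> int:
        return self._items.get(key, 0)


class ShardedCounter:
    """Spread increments over several keys; sum all shards to read."""

    def __init__(self, table, name: str, num_shards: int):
        self.table = table
        self.name = name
        self.num_shards = num_shards

    def _shard_key(self, shard: int) -> str:
        # Hypothetical key format, for example "hits#3".
        return f"{self.name}#{shard}"

    def increment(self, delta: int = 1) -> None:
        # Pick a random shard so writes spread across hash keys
        # (and therefore, ideally, across partitions).
        shard = random.randrange(self.num_shards)
        self.table.add(self._shard_key(shard), delta)

    def value(self) -> int:
        # Read back by pulling every shard and summing the values.
        return sum(
            self.table.get(self._shard_key(s)) for s in range(self.num_shards)
        )


counter = ShardedCounter(InMemoryTable(), "hits", num_shards=10)
for _ in range(1000):
    counter.increment()
for _ in range(250):
    counter.increment(-1)

total = counter.value()
print(total)  # 750
```

The trade-off is on the read side: the total requires one read per shard, and because the shards are read separately, the sum is not a single atomic snapshot while writes are still in flight.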
