How can I store the date with Datastore?

Problem description

The Datastore documentation is very clear that there is an issue with "hotspots" if you index 'monotonically increasing values' (like the current Unix time). However, it doesn't mention a good alternative, nor does it address whether storing the exact same value (rather than increasing values) would create "hotspots":

"Do not index properties with monotonically increasing values (such as a NOW() timestamp). Maintaining such an index could lead to hotspots that impact Cloud Datastore latency for applications with high read and write rates." https://cloud.google.com/datastore/docs/best-practices

I would like to store the time when each particular entity is inserted into the datastore. If that's not possible, storing just the date would also work.

That almost seems more likely to cause "hotspots" though, since every new entity for 24 hours would get added to the same index (that's my understanding anyway).

Perhaps there's something more going on with how indexes work (I am having trouble finding great explanations of exactly how they work), and indexing the same value over and over again is fine, but incrementing values is not.

I would appreciate it if anyone has an answer to this question, or else better documentation on how Datastore indexes work.

Solution

Is your application actually planning on querying the date? If not, consider simply not indexing that property. If you only need to read that property infrequently, consider writing a mapreduce rather than indexing.

That advice is given due to the way BigTable tablets work, which is described here: https://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
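To make the tablet argument concrete, here is a toy sketch in plain Python (it does not use Datastore or Bigtable at all). It assumes a simplified model in which each tablet owns a fixed, contiguous range of a sorted key space; real tablets split and move dynamically, so the names `NUM_TABLETS`, `SPLITS`, `tablet_for`, and `scatter` are purely illustrative. It only shows why sorted, ever-increasing index values tend to pile writes onto one range, while scattered values spread them out.

```python
import bisect
import hashlib
from collections import Counter

NUM_TABLETS = 8
KEY_SPACE = 2**32
# Assumption: static, evenly spaced split points. Real tablets split dynamically.
SPLITS = [i * (KEY_SPACE // NUM_TABLETS) for i in range(1, NUM_TABLETS)]


def tablet_for(index_key: int) -> int:
    """Return the number of the tablet whose sorted key range contains index_key."""
    return bisect.bisect_right(SPLITS, index_key)


def scatter(value: int) -> int:
    """Map a value to a pseudo-random position in the key space using SHA-256."""
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest[:4], "big")


def writes_per_tablet(index_keys) -> Counter:
    """Count how many writes land on each tablet."""
    return Counter(tablet_for(k) for k in index_keys)


# 1000 consecutive Unix timestamps, i.e. a monotonically increasing indexed value.
timestamps = [1_700_000_000 + i for i in range(1000)]

monotonic_load = writes_per_tablet(timestamps)
scattered_load = writes_per_tablet(scatter(t) for t in timestamps)

print("monotonic:", dict(monotonic_load))
print("scattered:", dict(sorted(scattered_load.items())))
```

In this model every timestamp write hits the same tablet, while the hashed values touch all of them. The trade-off is that scattered values no longer sort by time, so an ordered range scan over them is not meaningful.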

To the best of my knowledge, it's more important to have the primary key of an entity not be a monotonically increasing number. It would be better to have a string key, so the entity can be stored with better distribution.

Speaking as a non-expert, though, I can't imagine that an index on an individual property with monotonic values would be as problematic, if it's legitimately needed. In the Nomulus codebase, for example, we had a legitimate need for an index on time, because we wanted to delete commit logs older than a specific time.

One cool thing I think happens with these monotonic indexes is that, when these tablet splits don't happen, fetching the leftmost or rightmost element in the index actually has better latency properties than fetching stuff in the middle of the index. For example, if you do a query that just grabs the first result in the index, it can actually go faster than a key lookup.
