ndb and consistency: why does this behavior happen in a query without a parent?


Question

I'm doing some work with Python and ndb and can't understand why the following happens. I'll post the cases and the code below:

from google.appengine.ext import ndb

class Reference(ndb.Model):
  kind = ndb.StringProperty(required=True)
  created_at = ndb.DateTimeProperty(auto_now_add=True)
  some_id = ndb.StringProperty(indexed=True)
  data = ndb.JsonProperty(default={})

These tests were run in the Interactive Console, with dev_appserver.py started with the --high_replication option:

from models import Reference
from google.appengine.ext import ndb
import random

some_id = str(random.randint(1, 100000000000000))
key_id = str(random.randint(1, 100000000000000))

Reference(id=key_id, some_id=some_id, kind='user').put()
print Reference.query(Reference.some_id == some_id, Reference.kind == 'user').get()

# output:
# >> None

Why? Now, let's add a sleep(1) before printing:

from models import Reference
from google.appengine.ext import ndb
import random
from time import sleep

some_id = str(random.randint(1, 100000000000000))
key_id = str(random.randint(1, 100000000000000))

Reference(id=key_id, some_id=some_id, kind='user').put()
sleep(1)
print Reference.query(Reference.some_id == some_id, Reference.kind == 'user').get()

# output:
# >> Reference(key=Key('Reference', '99579233467078'), createdAt=datetime.datetime(2013, 1, 31, 16, 24, 46, 383100), data={}, kind=u'user', some_id=u'25000975872388')

OK, let's assume the delay emulates the time it takes to propagate the entity across Google's tables. I would never put a sleep in my code, of course. Now, let's remove the sleep and add a parent!

from models import Reference
from google.appengine.ext import ndb
import random
from time import sleep

some_id = str(random.randint(1, 100000000000000))
key_id = str(random.randint(1, 100000000000000))

Reference(id='father', kind='father').put()

Reference(parent=ndb.Key(Reference, 'father'), id=key_id, some_id=some_id, kind='user').put()
print Reference.query(Reference.some_id == some_id, Reference.kind == 'user', ancestor=ndb.Key(Reference, 'father')).get()

# output:
# >> Reference(key=Key('Reference', '46174672092602'), createdAt=datetime.datetime(2013, 1, 31, 16, 24, 46, 383100), data={}, kind=u'user', some_id=u'55143106000841')

Now that's confusing! Just setting a parent gives me strong consistency. Why? And if a parent is required for strong consistency, why not give all entities the same parent by default when inserting them into the datastore? Maybe I'm doing this completely wrong and there is a better way. Please, someone guide me!

Thanks in advance.

Recommended answer

Ancestor queries operate within a single entity group (whose entities are therefore stored in physical proximity) and are strongly consistent.

In test 1 the HRD might not see the put() yet: because of its distributed nature, it is only eventually consistent for non-ancestor queries.

In test 2 the HRD has had enough time to become consistent, so you see the entity in the query.

In test 3 you place the entity in the same entity group as the ancestor you query on, so the query is strongly consistent.
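The three tests can be mimicked with a small pure-Python toy model. This is only a sketch of the behavior described above, not how the HRD is implemented, and `ToyDatastore`, `replicate()` and the tuple-shaped keys are hypothetical names invented for illustration (none of them are part of ndb). The assumption it encodes is that non-ancestor queries read a lagging index, while ancestor queries and gets by key read the current data.

```python
class ToyDatastore(object):
    """Toy model: entity data is current, the global index lags behind."""

    def __init__(self):
        self._entities = {}      # key path (tuple) -> properties, always current
        self._global_index = {}  # lagging copy read by non-ancestor queries
        self._pending = []       # keys written but not yet visible globally

    def put(self, key, **props):
        self._entities[key] = dict(props)
        self._pending.append(key)

    def replicate(self):
        """Stand-in for the time the real datastore needs to catch up."""
        for key in self._pending:
            self._global_index[key] = self._entities[key]
        self._pending = []

    def get(self, key):
        """Get by key: reads the current data."""
        return self._entities.get(key)

    def query(self, ancestor=None, **filters):
        """Return the first matching key or None, like query().get()."""
        if ancestor is None:
            source = self._global_index  # may be stale
        else:
            # Restrict to one entity group and read the current data.
            source = dict((k, v) for k, v in self._entities.items()
                          if k[:len(ancestor)] == ancestor)
        for key in sorted(source):
            props = source[key]
            if all(props.get(n) == v for n, v in filters.items()):
                return key
        return None


ds = ToyDatastore()

# Test 1: put, then query immediately without an ancestor.
ds.put(("u1",), some_id="42", kind="user")
test1 = ds.query(some_id="42", kind="user")

# Test 2: same query after the index has caught up (the sleep(1) case).
ds.replicate()
test2 = ds.query(some_id="42", kind="user")

# Test 3: put under a parent and query with that ancestor, no waiting.
ds.put(("father",), kind="father")
ds.put(("father", "u2"), some_id="43", kind="user")
test3 = ds.query(ancestor=("father",), some_id="43", kind="user")
still_stale = ds.query(some_id="43", kind="user")
by_key = ds.get(("father", "u2"))
```

In this model `test1` and `still_stale` come back empty while `test2`, `test3` and `by_key` find the entity, which matches the pattern seen in the question.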

Q: Why not put everything in the same entity group?
A: GAE can't distribute a massive dataset unless it is split into many entity groups, which it can then spread across many different servers. Entity groups should be as large as you need them to be and no larger (Google sometimes uses the example of putting a user's "messages" under a User object). Also, since writing to a member of an entity group locks the whole group, you face write-rate limits (about 1 write/sec per entity group, if I remember correctly; Alfred has a talk on it).
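To make the write-rate argument concrete, here is a rough back-of-the-envelope sketch. It assumes the approximate figure of 1 write/sec per entity group quoted above and an even spread of writes across groups; the function name and the constant are hypothetical, chosen only for this illustration.

```python
import math

# Assumption: the rough per-entity-group figure quoted in the answer.
WRITES_PER_SEC_PER_GROUP = 1.0


def seconds_to_apply(total_writes, entity_groups,
                     rate=WRITES_PER_SEC_PER_GROUP):
    """Rough lower bound on the time needed to apply the writes,
    assuming they are spread evenly and groups are written in parallel."""
    if entity_groups < 1:
        raise ValueError("need at least one entity group")
    # The busiest group sets the pace, since its writes are serialized.
    writes_in_busiest_group = math.ceil(float(total_writes) / entity_groups)
    return writes_in_busiest_group / rate


single_group = seconds_to_apply(1000, 1)    # everything under one parent
many_groups = seconds_to_apply(1000, 100)   # e.g. one group per user
```

Under these assumptions, 1000 writes into a single shared parent need on the order of 1000 seconds, while the same writes spread over 100 groups need about 10.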

Q: My get() didn't return the object; isn't it supposed to?
A: No. Only gets by key are strongly consistent. You did a query().get(), which is really just shorthand for a query with LIMIT 1.
