ndb and consistency: why does this behavior happen in a query without a parent?


Problem description

I'm doing some work with Python and ndb and can't understand the behavior I'm seeing. I'll post the cases and the code below:

models.py

class Reference(ndb.Model):
  kind = ndb.StringProperty(required=True)
  created_at = ndb.DateTimeProperty(auto_now_add=True)
  some_id = ndb.StringProperty(indexed=True)
  data = ndb.JsonProperty(default={})

These tests were run in the Interactive Console, with the --high_replication option passed to dev_appserver.py:

Test 1

from models import Reference
from google.appengine.ext import ndb
import random

some_id = str(random.randint(1, 100000000000000))
key_id = str(random.randint(1, 100000000000000))

Reference(id=key_id, some_id=some_id, kind='user').put()
print Reference.query(Reference.some_id == some_id, Reference.kind == 'user').get()

# output:
# >> None

Why? Now, let's add a sleep(1) before printing:

Test 2

from models import Reference
from google.appengine.ext import ndb
import random
from time import sleep

some_id = str(random.randint(1, 100000000000000))
key_id = str(random.randint(1, 100000000000000))

Reference(id=key_id, some_id=some_id, kind='user').put()
sleep(1)
print Reference.query(Reference.some_id == some_id, Reference.kind == 'user').get()

# output:
# >> Reference(key=Key('Reference', '99579233467078'), createdAt=datetime.datetime(2013, 1, 31, 16, 24, 46, 383100), data={}, kind=u'user', some_id=u'25000975872388')

OK, let's assume it's emulating the time needed to propagate the entity across Google's replicas. I would never put a sleep into my code, of course. Now, let's remove the sleep and add a parent!

Test 3

from models import Reference
from google.appengine.ext import ndb
import random

some_id = str(random.randint(1, 100000000000000))
key_id = str(random.randint(1, 100000000000000))

# Create the parent entity.
Reference(id='father', kind='father').put()

# Create a child in the father's entity group, then query with that ancestor.
Reference(parent=ndb.Key(Reference, 'father'), id=key_id, some_id=some_id, kind='user').put()
print Reference.query(Reference.some_id == some_id, Reference.kind == 'user', ancestor=ndb.Key(Reference, 'father')).get()

# output:
# >> Reference(key=Key('Reference', '46174672092602'), createdAt=datetime.datetime(2013, 1, 31, 16, 24, 46, 383100), data={}, kind=u'user', some_id=u'55143106000841')

Now that's confusing! Just setting a parent gives me strong consistency. Why? And if a parent is required for strong consistency, why don't all entities get the same parent by default when they are inserted into the datastore? Maybe I'm doing this completely wrong and there is a better way. Please, someone guide me!

Thanks in advance

Solution

Ancestor queries operate within a single entity group (and therefore in physical proximity) and are strongly consistent.

In test 1 the query might not see the put(), since the HRD is eventually consistent due to its distributed nature.

In test 2 the HRD has enough time to become consistent so you see the entity in the query.

In test 3 you put the entity in an entity group and query by its ancestor, so the query is strongly consistent.

Q: Why not have everything in the same entity group?
A: GAE can't distribute a massive dataset unless there are many entity groups (it can then spread them across lots of different servers). Entity groups should be just as large as you need them to be and no larger (Google sometimes uses the example of putting a user's "messages" under a User object). Also, since writing to any member of an entity group locks the whole group, you face write speed limitations (about 1 write/sec, if I remember correctly; Alfred has a talk on it).
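
As a back-of-the-envelope illustration of that write limit, the sketch below estimates how long a batch of writes would take when every entity shares one parent versus when entities are grouped per user. It is plain Python, not ndb. The helper name `estimated_seconds`, the key paths, and the 1 write/sec constant (the figure recalled above, not a measured value) are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: the "about 1 write/sec per entity group" figure recalled above.
ASSUMED_WRITES_PER_SEC_PER_GROUP = 1.0


def estimated_seconds(key_paths):
    """Rough time to apply all writes if separate groups proceed in parallel.

    Each key path is a tuple whose first element names the entity group root.
    """
    writes_per_group = Counter(path[0] for path in key_paths)
    # The busiest group is the bottleneck, since its writes are serialized.
    return max(writes_per_group.values()) / ASSUMED_WRITES_PER_SEC_PER_GROUP


# 1000 writes, all under a single 'father' root.
one_parent = [('father', n) for n in range(1000)]
# The same 1000 writes spread over 100 hypothetical per-user roots.
per_user = [('user-%d' % (n % 100), n) for n in range(1000)]

single_group_seconds = estimated_seconds(one_parent)
per_user_seconds = estimated_seconds(per_user)
```

Under these assumptions the single shared parent serializes every write, while the per-user layout is limited only by the busiest user's group.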

Q: My get() didn't get the object, isn't it supposed to?
A: No, only gets by key are strongly consistent. You did a query().get(), which is really just shorthand for a query with LIMIT 1.
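
The visibility rules behind the three tests can be mimicked with a small toy model. This is plain Python and not the ndb API: `ToyDatastore`, its methods, and the `replicate()` step (standing in for the sleep(1) in test 2) are hypothetical names invented for illustration, and the real datastore works very differently internally. The sketch only reproduces the observable behavior described above: gets by key and ancestor queries see a write immediately, while a non-ancestor query may not.

```python
class ToyDatastore(object):
    """Toy model only: entities are visible by key at once; the global index lags."""

    def __init__(self):
        self.entities = {}      # key path -> properties, updated on every put()
        self.global_index = {}  # what non-ancestor queries are able to see
        self.pending = []       # keys written but not yet in the global index

    def put(self, key, **props):
        # key is a tuple path such as ('father', '42'); key[0] is the group root.
        self.entities[key] = props
        self.pending.append(key)
        return key

    def get(self, key):
        # A lookup by key reads the entity itself: strongly consistent.
        return self.entities.get(key)

    def replicate(self):
        # Stands in for the passage of time (the sleep(1) in test 2).
        for key in self.pending:
            self.global_index[key] = self.entities[key]
        self.pending = []

    def query(self, ancestor=None, **filters):
        if ancestor is None:
            # Non-ancestor query: only sees the lagging global index.
            visible = self.global_index
        else:
            # Ancestor query: scoped to one entity group, sees all of its writes.
            visible = dict((k, v) for k, v in self.entities.items()
                           if k[0] == ancestor)
        for key, props in visible.items():
            if all(props.get(name) == value for name, value in filters.items()):
                return key  # first match, like query().get()
        return None


ds = ToyDatastore()
key = ds.put(('father', '42'), some_id='x', kind='user')

test1 = ds.query(some_id='x', kind='user')                     # like test 1
by_key = ds.get(key)                                           # get by key
test3 = ds.query(ancestor='father', some_id='x', kind='user')  # like test 3
ds.replicate()
test2 = ds.query(some_id='x', kind='user')                     # like test 2
```

In this model `test1` comes back empty while `by_key`, `test3`, and (after `replicate()`) `test2` all find the entity. The practical takeaway is the same as in the answer: when you need to read your own write right away, keep the key returned by put() and fetch by key, or query within the entity's ancestor.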
