What would be the purpose of putting all datastore entities in a single group?


Problem description


    I have started working on an existing project that uses Google Datastore. For some of the entity kinds, every entity is assigned the same ancestor. Example:

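    # Assumed import (legacy App Engine ndb); with Cloud NDB use `from google.cloud import ndb`.
    from google.appengine.ext import ndb


    class BaseModel(ndb.Model):
        @classmethod
        def create(cls, **kwargs):
            # Every entity of a given kind is created under the same parent key.
            return cls(parent=cls.make_key(), **kwargs)

        @classmethod
        def make_key(cls):
            # No 'Group' entity is ever stored; the key is used only as an ancestor.
            return ndb.Key('Group', cls.key_name())


    class Vehicle(BaseModel):
        @classmethod
        def key_name(cls):
            return 'vehicle_group'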
    

    So the keys end up looking like this:

    Key(Group, 'vehicle_group', Vehicle, 5068993417183232)
    

    There is no 'Group' kind and no 'vehicle_group' entity, but that is OK according to the docs: "note that unlike in a file system, the parent entity need not actually exist".

    I understand from reading that this might have a performance benefit in that all the entities of a kind are colocated in the distributed datastore.

    But putting all these entities in a single group would, in my mind, create problems as the project scales, because the one-write-per-second limit would apply to the entire kind. There doesn't appear to be any transactional reason for the group.

    No one on the project knows why it was originally done like this. My questions are:

    • Does anyone know where this "xxx_group" single entity scheme comes from?
    • And is it as bunk as it appears to be?

    Solution

    Grouping many entities inside a single entity group offers at least 2 advantages I can think of:

    • ability to perform (ancestor) queries inside transactions - non-ancestor (or cross-group) queries are not allowed inside transactions
    • ability to access many entities inside the same transaction - cross-group transactions are limited to max 25 entity groups
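To make the second point concrete, here is a minimal pure-Python sketch that does not call the datastore at all. It assumes keys are written as flat tuples of kind/ID values and that an entity group is identified by the first pair in the path. The helper names `entity_group` and `fits_in_one_transaction` are hypothetical, and the limit of 25 is simply the figure quoted above.

```python
# Illustrative only: keys are modelled as flat tuples (kind, id, kind, id, ...).
MAX_GROUPS_PER_TRANSACTION = 25  # figure quoted in the answer above


def entity_group(flat_key):
    """Return the root (kind, id) pair, which identifies the entity group."""
    return (flat_key[0], flat_key[1])


def fits_in_one_transaction(flat_keys):
    """True if the keys span no more than the allowed number of entity groups."""
    groups = {entity_group(key) for key in flat_keys}
    return len(groups) <= MAX_GROUPS_PER_TRANSACTION


# 100 vehicles that share one ancestor: a single entity group.
grouped = [('Group', 'vehicle_group', 'Vehicle', i) for i in range(1, 101)]

# 100 root-level vehicles: each one is its own entity group.
ungrouped = [('Vehicle', i) for i in range(1, 101)]
```

Under these assumptions the 100 grouped vehicles count as one entity group, while the 100 root-level vehicles count as 100 groups and would exceed the cross-group limit.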

    The 1 write/second/group limit might not be a scalability issue at all for some applications (think of write-once, read-many apps, for example, or apps for which 1 write per second is more than enough).
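A rough way to check whether the limit matters for a given workload is a back-of-the-envelope calculation like the one below. It is only a sketch: it treats the limit as a sustained average of one write per second per group, which is a simplification, and `groups_needed` is a hypothetical helper, not part of any library.

```python
import math

WRITES_PER_SECOND_PER_GROUP = 1.0  # limit quoted in the answer above


def groups_needed(sustained_writes_per_second):
    """Minimum number of entity groups needed to absorb a sustained write rate."""
    if sustained_writes_per_second < 0:
        raise ValueError("write rate must be non-negative")
    # At least one group is always needed to hold the entities.
    return max(1, math.ceil(sustained_writes_per_second / WRITES_PER_SECOND_PER_GROUP))
```

For example, a kind that receives one write every five seconds (0.2 writes per second) fits comfortably in a single group, while a sustained 12.5 writes per second would need the entities spread over at least 13 groups.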

    As for the mechanics, the (unique) parent "entity" key for the group is the ndb.Key('Group', "xxx_group") key (which has the "xxx_group" key ID). The corresponding "entity" or its model doesn't need to exist (unless the entity itself needs to be created, but that doesn't appear to be the case). The parent key is used simply to establish the group's "namespace" in the datastore, if you like.

    You can see a somewhat similar use in the examples from the Entity Keys documentation. Check out how Message is used (except that there Message is just a "parent" entity in the ancestor path, not the root entity):

    class Revision(ndb.Model):
        message_text = ndb.StringProperty()

    ndb.Key('Account', 'sandy@foo.com', 'Message', 123, 'Revision', '1')
    ndb.Key('Account', 'sandy@foo.com', 'Message', 123, 'Revision', '2')
    ndb.Key('Account', 'larry@foo.com', 'Message', 456, 'Revision', '1')
    ndb.Key('Account', 'larry@foo.com', 'Message', 789, 'Revision', '2')
    

    ...

    Notice that Message is not a model class. This is because we are using Message purely as a way to group Revisions, not to store data.
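The difference between a parent and a root in an ancestor path can be sketched in plain Python, without touching the datastore. The flat-tuple representation and the helper names `pairs`, `root`, and `parent` are assumptions made for illustration. Note that `parent` here returns only the immediate parent's (kind, id) pair, not a full key.

```python
def pairs(flat_key):
    """Split a flat key path (kind, id, kind, id, ...) into (kind, id) pairs."""
    if len(flat_key) == 0 or len(flat_key) % 2 != 0:
        raise ValueError("a key path needs a non-empty, even number of elements")
    return [(flat_key[i], flat_key[i + 1]) for i in range(0, len(flat_key), 2)]


def root(flat_key):
    """The first pair in the path; it identifies the entity group."""
    return pairs(flat_key)[0]


def parent(flat_key):
    """The pair just before the entity's own pair, or None for a root entity."""
    path = pairs(flat_key)
    return path[-2] if len(path) > 1 else None


# Key from the documentation example: Message is the parent, Account is the root.
revision = ('Account', 'sandy@foo.com', 'Message', 123, 'Revision', '1')

# Key from the question: the 'Group' pair is both the parent and the root.
vehicle = ('Group', 'vehicle_group', 'Vehicle', 5068993417183232)
```

In the Revision key the grouping kind sits in the middle of the path, whereas in the Vehicle key the grouping pair is the root, so every Vehicle ends up in the same entity group.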
