Index Builder for Fast Retrieval, Similar to Multiple-Table Retrieval in a Single Query in App Engine

Problem description

In the Google App Engine Datastore (HRD), we cannot perform joins or query multiple tables directly using a Query object or GQL.

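I would like to know whether my idea is a sound approach. Suppose we build an index in hierarchical order (parent, child, grandchild), node by node:

Node
- Key
- IndexedProperty
- Set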

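If we want to collect all the children and grandchildren, we can gather every key in the hierarchy that matches the filter condition and return those keys as the result.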
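
The idea above can be sketched in plain Python. This is only an illustration, not App Engine code: `Node`, `collect_descendant_keys`, and the key strings are hypothetical names, an in-memory dictionary stands in for the stored index, and reading the node's "Set" as the set of its child keys is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One index node: its own key, one indexed property, and its child keys."""
    key: str
    indexed_property: str
    children: set = field(default_factory=set)  # assumption: "Set" = child keys


def collect_descendant_keys(index, root_key, predicate=lambda node: True):
    """Return keys of all descendants of root_key whose node matches predicate."""
    matched = []
    seen = set()
    # Sorting only makes the traversal order deterministic for the example.
    stack = sorted(index[root_key].children, reverse=True)
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        node = index[key]
        if predicate(node):
            matched.append(key)
        stack.extend(sorted(node.children, reverse=True))
    return matched


# Hypothetical three-level hierarchy: parent -> children -> grandchildren.
index = {
    "parent:1": Node("parent:1", "parent", {"child:a", "child:b"}),
    "child:a": Node("child:a", "child", {"grandchild:1", "grandchild:2"}),
    "child:b": Node("child:b", "child", {"grandchild:3"}),
    "grandchild:1": Node("grandchild:1", "grandchild"),
    "grandchild:2": Node("grandchild:2", "grandchild"),
    "grandchild:3": Node("grandchild:3", "grandchild"),
}

grandchild_keys = collect_descendant_keys(
    index, "parent:1", lambda node: node.indexed_property == "grandchild"
)
```

Note that every level of the hierarchy still has to be read to walk it, so the index itself must be loaded before any key lookup can start.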



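In Memcache we can hold each key, pointing to its DB entity. If the cache does not have an entry, we can still fetch all the records from the DB in a single query using the set of keys.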
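
A minimal sketch of that cache-then-batch-fetch flow, assuming two plain dictionaries stand in for Memcache and the Datastore. `get_entities` is a hypothetical helper, not part of any App Engine library; in a real application the dictionary lookups would be replaced by the client library's batch get calls.

```python
def get_entities(keys, cache, datastore):
    """Return {key: entity}, reading the cache first, then one batched fetch."""
    found = {k: cache[k] for k in keys if k in cache}
    missing = [k for k in keys if k not in found]
    if missing:
        # One batched lookup for every miss, rather than one round trip per key.
        fetched = {k: datastore[k] for k in missing if k in datastore}
        cache.update(fetched)  # populate the cache for the next reader
        found.update(fetched)
    return found


# Hypothetical data: one entity is already cached, two are not.
datastore = {
    "child:a": {"title": "A"},
    "grandchild:1": {"title": "G1"},
    "grandchild:2": {"title": "G2"},
}
cache = {"child:a": {"title": "A"}}

result = get_entities(["child:a", "grandchild:1", "grandchild:2"], cache, datastore)
```

Keys that exist in neither store are simply absent from the result, so callers should not assume every requested key comes back.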



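Pros

1) Fast retrieval: Google recommends fetching entities by key.
2) A single transaction is enough to collect data from multiple tables.
3) Memcache and the persistent Datastore will represent the data in the same form.
4) Only the data related to a group, such as a user or a parent node, is scanned.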

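Cons

1) The metadata stored in the DB grows, so the overall DB size increases.
2) If the index for a single parent takes more than 1 MB, we have to split it and save it as a blob in the DB.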
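
The splitting step in the second point could look roughly like this. It is a sketch under stated assumptions: `MAX_ENTITY_BYTES`, `split_index`, and `join_index` are hypothetical names, the limit is taken from the 1 MB figure above, and the index is assumed to be serialized as newline-separated keys.

```python
MAX_ENTITY_BYTES = 1_000_000  # assumed limit, based on the 1 MB figure above


def split_index(serialized, chunk_size=MAX_ENTITY_BYTES):
    """Split a serialized index into ordered chunks no larger than chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        serialized[i:i + chunk_size]
        for i in range(0, len(serialized), chunk_size)
    ]


def join_index(chunks):
    """Reassemble the chunks in their original order."""
    return b"".join(chunks)


# Hypothetical index: 200,000 keys serialized one per line.
keys = [f"grandchild:{n}" for n in range(200_000)]
serialized = "\n".join(keys).encode("utf-8")
chunks = split_index(serialized)
```

Reading such an index back requires fetching every chunk before any key can be used, which adds reads on top of the extra storage already listed as a drawback.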



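Is this structure a good approach or not?

If the hierarchy has many deep levels, this avoids running a large number of query operations to collect all the items that depend on a parent.

In the case of multiple parents:
- Collect all the indexes and get the keys related to the query.
- Collect all the data in a single transaction using the list of keys.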



If anyone finds more pros or cons, please add them and explain whether this approach is correct or not.

Many thanks,
Krishnan

Solution

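There are quite a few things going on here that are important to think about.

Datastore is not a relational database. You definitely should not approach your data storage from a tables-and-joins perspective. It will lead to a messy and most likely inefficient setup.

It seems like you are trying to restructure your use of Datastore to get fully transactional and strongly consistent access to your data. Datastore cannot provide this natively because it is too inefficient to offer these guarantees together with high availability.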



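With the Datastore, you want to be able to support many (thousands, hundreds of thousands, millions, etc.) writes per second to different entities. The Datastore provides the notion of an entity group because it lets the developer specify a particular scope of consistency.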



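Consider an example todo tracking service. You might define a User kind and a Todo kind. You would not want to provide strong consistency across all Todos, because every time a user added a new note, the underlying system would have to ensure that it was written transactionally with respect to all other users writing notes. With entity groups, on the other hand, you can say that a single User represents your unit of consistency. This means that when a user writes a new note, the write has to be applied transactionally with any other modification to that user's notes. This is a much better unit of consistency, because as your service scales to more users, they will not conflict with each other.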
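
The todo example can be modelled with a toy in-memory store that tracks one version counter per entity group. This is an illustration of the idea only, not Datastore's implementation or API: `GroupedStore`, `ConflictError`, and the key strings are hypothetical, and the optimistic version check is a simplifying assumption.

```python
class ConflictError(Exception):
    """Raised when another write to the same entity group committed first."""


class GroupedStore:
    """Toy store that checks for conflicts per entity group, not globally."""

    def __init__(self):
        self.entities = {}  # (group, entity_id) -> value
        self.versions = {}  # group -> number of committed writes

    def begin(self, group):
        """Start a transaction by remembering the group's current version."""
        return group, self.versions.get(group, 0)

    def commit(self, txn, entity_id, value):
        """Apply the write only if the group is unchanged since begin()."""
        group, seen_version = txn
        if self.versions.get(group, 0) != seen_version:
            raise ConflictError(group)
        self.entities[(group, entity_id)] = value
        self.versions[group] = seen_version + 1


store = GroupedStore()

# Two different users write at the same time: different groups, no conflict.
txn_alice = store.begin("User:alice")
txn_bob = store.begin("User:bob")
store.commit(txn_alice, "Todo:1", "buy milk")
store.commit(txn_bob, "Todo:1", "call mum")

# Two overlapping writes for the same user: the second one has to retry.
first = store.begin("User:alice")
second = store.begin("User:alice")
store.commit(first, "Todo:2", "pay rent")
try:
    store.commit(second, "Todo:3", "walk dog")
    conflicted = False
except ConflictError:
    conflicted = True
```

Because the version check is scoped to a group, adding more users adds more independent groups rather than more contention.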



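You are talking about creating and managing your own indexes. You almost certainly do not want to do this from an efficiency point of view. Further, you would have to be very careful, since it seems you would have a huge number of writes to a single entity, or a range of entities, that represents your table. This is a known Datastore anti-pattern.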
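
A rough way to see the hot spot is to count writes per entity with and without a hand-built index. The sketch below is a simulation with made-up names and numbers (`write_counts`, `Index:todos`, 1,000 writes from 100 users); it assumes the custom index lives in one shared entity that is rewritten on every write.

```python
from collections import Counter


def write_counts(todo_writes, maintain_index):
    """Count writes per entity for a stream of (user, todo_id) writes."""
    counts = Counter()
    for user, todo_id in todo_writes:
        counts[f"Todo:{user}/{todo_id}"] += 1
        if maintain_index:
            # Hypothetical hand-built index: one shared entity, rewritten each time.
            counts["Index:todos"] += 1
    return counts


# 1,000 writes spread evenly over 100 hypothetical users.
writes = [(f"user{n % 100}", n) for n in range(1000)]

with_index = write_counts(writes, maintain_index=True)
without_index = write_counts(writes, maintain_index=False)

hottest_with = with_index.most_common(1)[0]
hottest_without = max(without_index.values())
```

In this simulation every write lands on the shared index entity as well, so its write count grows with total traffic, while each Todo entity is written only once.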

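One of the hard parts about the Datastore is that each project may have very different requirements, and therefore a very different data layout. There is definitely no one-size-fits-all way to structure your data, but here are some resources:

  • What actually happens when you write to the Datastore (//developers.google.com/appengine/articles/life_of_write)
  • How Datastore stores data
  • Datastore entity relationship modeling
  • Datastore transaction isolation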


