How to think in data stores instead of databases?

Question

As an example, Google App Engine uses Google Datastore, not a standard database, to store data. Does anybody have any tips for using Google Datastore instead of databases? It seems I've trained my mind to think 100% in object relationships that map directly to table structures, and now it's hard to see anything differently. I can understand some of the benefits of Google Datastore (e.g. performance and the ability to distribute data), but some good database functionality is sacrificed (e.g. joins).

Does anybody who has worked with Google Datastore or BigTable have any good advice on working with them?

Recommended answer

There are two main things to get used to about the App Engine datastore when compared to 'traditional' relational databases:

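  • The datastore makes no distinction between inserts and updates. When you call put() on an entity, that entity gets stored to the datastore with its unique key, and anything that already has that key gets overwritten. Basically, each entity kind in the datastore acts like an enormous map or sorted list.
  • Querying, as you alluded to, is much more limited. No joins, for a start.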
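
The first point can be illustrated with a small contrast. This is only a sketch: the relational side uses Python's built-in sqlite3 module, while the `datastore` dict and the `put` helper are hypothetical stand-ins for the real datastore, not its API.

```python
import sqlite3

# Relational side: INSERT and UPDATE are distinct operations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person VALUES ('alice', 'Alice')")
try:
    conn.execute("INSERT INTO person VALUES ('alice', 'Alice B.')")
    second_insert_failed = False
except sqlite3.IntegrityError:
    # A second INSERT with the same primary key is rejected.
    second_insert_failed = True

# Datastore side, modelled as a plain dict (hypothetical stand-in).
datastore = {}


def put(key, entity):
    """Store the entity under its key, replacing any previous value."""
    datastore[key] = entity


put(("Person", "alice"), {"name": "Alice"})
put(("Person", "alice"), {"name": "Alice B."})  # overwrites, no error
```

In the dict model there is simply no way to tell whether a put() created the entry or replaced one, which is the behaviour the first bullet describes.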

The key thing to realise, and the reason behind both of these differences, is that Bigtable basically acts like an enormous ordered dictionary. Thus, a put operation just sets the value for a given key, regardless of any previous value for that key, and fetch operations are limited to fetching single keys or contiguous ranges of keys. More sophisticated queries are made possible with indexes, which are basically just tables of their own, allowing you to implement more complex queries as scans on contiguous ranges.
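
A toy model may make the "ordered dictionary plus index tables" idea more concrete. Everything below is an illustrative assumption: `OrderedStore`, `put_person`, `people_in` and the key layout are invented for this sketch and are not how Bigtable or the datastore is actually implemented.

```python
import bisect


class OrderedStore:
    """Toy ordered key-value table: single-key gets and range scans only."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def put(self, key, value):
        # Sets the value for the key, whatever was there before.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


entities = OrderedStore()  # primary table: entity key -> entity
by_city = OrderedStore()   # index table: (city, entity key) -> None


def put_person(key, person):
    """Write the entity and its index row.

    A real index would also remove the stale row when the city changes;
    that is left out to keep the sketch short.
    """
    entities.put(key, person)
    by_city.put((person["city"], key), None)


def people_in(city):
    """Answer 'WHERE city = ?' as a scan over one contiguous index range."""
    rows = by_city.scan((city,), (city + "\x00",))
    return [entities.get(entity_key) for (_, entity_key), _ in rows]


put_person("p1", {"name": "Ann", "city": "Oslo"})
put_person("p2", {"name": "Bob", "city": "Paris"})
put_person("p3", {"name": "Cy", "city": "Oslo"})
oslo_names = [p["name"] for p in people_in("Oslo")]
```

Because the index rows for one city sit next to each other in sorted order, the equality filter turns into a single range scan followed by key lookups, with no scan of the primary table.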

Once you've absorbed that, you have the basic knowledge needed to understand the capabilities and limitations of the datastore. Restrictions that may have seemed arbitrary probably make more sense.

The key thing here is that although these are restrictions compared with what you can do in a relational database, these same restrictions are what make it practical to scale up to the sort of magnitude that Bigtable is designed to handle. You simply can't execute the sort of query that looks good on paper but is atrociously slow in an SQL database.

In terms of how to change how you represent data, the most important thing is precalculation. Instead of doing joins at query time, precalculate data and store it in the datastore wherever possible. If you want to pick a random record, generate a random number and store it with each record. There's a whole cookbook of tips and tricks of this sort here.
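
Both precalculation ideas can be sketched in a few lines. The names below (`authors`, `posts`, `put_post`, `records`, `put_record`, `pick_random`) are hypothetical, plain Python containers stand in for the datastore, and the lookup step for the random pick (take the first record at or above a fresh random number, wrapping around) is one possible reading of the advice rather than something the answer spells out.

```python
import bisect
import random

# Precalculated "join": copy the author's name onto each post at write
# time, so listing posts needs no second lookup at query time.
authors = {"a1": {"name": "Ann"}}
posts = {}


def put_post(post_id, author_id, title):
    posts[post_id] = {
        "title": title,
        "author_id": author_id,
        "author_name": authors[author_id]["name"],  # denormalised copy
    }


# Random selection: store a random number with each record. The sorted
# list stands in for an index on that stored number.
records = []  # sorted list of (stored_random, record_id)


def put_record(record_id, rng=random):
    bisect.insort(records, (rng.random(), record_id))


def pick_random(rng=random):
    """Return the id of the first record whose stored number is >= a
    fresh draw, wrapping to the first record if none is."""
    if not records:
        return None
    i = bisect.bisect_left(records, (rng.random(),))
    return records[i % len(records)][1]


put_post("post1", "a1", "Hello")
for record_id in ("r1", "r2", "r3"):
    put_record(record_id)
chosen = pick_random()
```

The trade-off is the usual one for denormalised data: if the author's name changes, every copied `author_name` has to be rewritten. The random pick is also only approximately uniform, since the gaps between the stored numbers are uneven.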
