Cell versioning with Cassandra


Problem description


My application uses an AbstractFactory for the DAO layer, so once the HBase DAO family has been implemented, it would be great to create the Cassandra DAO family and compare the two from several points of view.
While trying to do that, I saw that Cassandra doesn't support cell versioning the way HBase does (and my application makes heavy use of it), so I was wondering whether there is a table design trick (or something else) to "emulate" this behaviour in Cassandra.

Solution

One common strategy is to use composite column names with two components: the normal column name, and a version. What you use for the version component depends on your access patterns. If you might have updates coming from multiple clients simultaneously, then using a TimeUUID is your safest option. If only one client may update at a time, you can use something smaller, like a timestamp or version number.
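To make the trade-off concrete, here is a minimal Python sketch of the two kinds of version component. The helper names `timeuuid_column` and `counter_column` are hypothetical, and `uuid.uuid1()` from the standard library stands in for a Cassandra TimeUUID (both are version 1, time-based UUIDs). It only builds column names and does not talk to a cluster.

```python
import uuid

def timeuuid_column(name):
    """Composite column name whose version is a time-based (version 1) UUID.

    Suited to concurrent writers: the UUID combines a timestamp with node
    and clock-sequence bits, so two clients are very unlikely to collide.
    """
    return (name, uuid.uuid1())

def counter_column(name, version):
    """Composite column name with a plain integer version.

    Smaller, but only safe when a single client hands out version numbers.
    """
    return (name, version)

# Two updates to the same field get distinct composite names.
a = timeuuid_column("body")
b = timeuuid_column("body")
print(a[1].version, a != b)

# With one writer, an incrementing integer is enough.
print(counter_column("title", 0))
```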

Assuming you use version numbers for simplicity, here's what that might look like for storing documents with versioned fields:

| ('body', 5) | ('body', 4) | ... | ('title', 1) | ('title', 0) |
|-------------|-------------|-----|--------------|--------------|
| 'Neque ...' | 'Dolor ...' | ... | 'Lorem Ipsum'| 'My Document'|

This format is primarily useful if you want a specific version of a field, all versions of a field, or all versions of all fields.
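Those three access patterns can be sketched against an in-memory stand-in for the row above. This is plain Python rather than a Cassandra client: the dict `row` and the functions `get_version`, `all_versions`, and `all_fields` are illustrative names, and only the cells actually shown in the table are included.

```python
# In-memory stand-in for one wide row; this is NOT a Cassandra client.
# Keys are composite column names (field, version), values are cell contents.
row = {
    ("body", 5): "Neque ...",
    ("body", 4): "Dolor ...",
    ("title", 1): "Lorem Ipsum",
    ("title", 0): "My Document",
}

def get_version(row, field, version):
    """A specific version of one field (a single-column lookup)."""
    return row.get((field, version))

def all_versions(row, field):
    """All stored versions of one field, newest first (a column slice)."""
    cells = [(v, value) for (f, v), value in row.items() if f == field]
    return sorted(cells, reverse=True)

def all_fields(row):
    """Every version of every field, grouped by field name, newest first."""
    result = {}
    for (field, version), value in row.items():
        result.setdefault(field, []).append((version, value))
    for cells in result.values():
        cells.sort(reverse=True)
    return result

print(get_version(row, "title", 0))
print(all_versions(row, "body"))
print(all_fields(row))
```

Note that getting only the newest value of every field from this layout means scanning every version, which is what motivates a separate structure for that query.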

If you also want to support efficiently fetching the latest version of all fields at once, I suggest you denormalize and add a second column family where only the latest version of each field is stored in its normal form. You can blindly overwrite these fields for each change. Continuing our example, this column family would look like:

|   'body'    |    'title'    |
|-------------|---------------|
| 'Neque ...' | 'Lorem Ipsum' |
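The dual-write pattern might look roughly like the following, again using plain dicts in place of the two column families. The names `versions`, `latest`, `write_field`, `read_latest`, and the document id `"doc1"` are made up for the example, and the sketch assumes writes arrive in version order.

```python
# Plain dicts standing in for the two column families (not a Cassandra client).
versions = {}   # (doc_id, field, version) -> value; full history
latest = {}     # (doc_id, field) -> value; newest value only

def write_field(doc_id, field, version, value):
    """Record a new version and blindly overwrite the 'latest' copy."""
    versions[(doc_id, field, version)] = value
    # No read-before-write: this assumes updates arrive in version order.
    latest[(doc_id, field)] = value

def read_latest(doc_id):
    """Fetch the newest value of every field without touching the history."""
    return {field: value for (d, field), value in latest.items() if d == doc_id}

write_field("doc1", "title", 0, "My Document")
write_field("doc1", "title", 1, "Lorem Ipsum")
write_field("doc1", "body", 4, "Dolor ...")
write_field("doc1", "body", 5, "Neque ...")

print(read_latest("doc1"))
print(len(versions))
```

The cost is one extra write per change; in exchange, the "latest of everything" read never has to scan old versions.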
