Storing a list of values in Cassandra


Question

Some of the answers to this question deal with older versions of Cassandra. The correct answer to this kind of problem depends on the version of Cassandra you are using.

I have a profile column family and want to store a list of skills in each profile. I'm not sure how this is typically done in Cassandra. One option would be to store a serialized Thrift or protobuf structure, but I'd prefer not to: as far as I know, Cassandra has no knowledge of these formats, so the data in the datastore would not be human-readable or queryable via CQL from the command line. The other solution I thought of would be to use a super column and put each skill in as a key with an empty value:

skills: {
  'java': '',
  'c++': '',
  'cobol': ''
}

Is this a good way of handling lists in Cassandra? I imagine there's some idiom I'm not aware of. I'm using the Astyanax client library, which supports only composite columns, not super columns, so the solution I proposed above would be quite awkward in that case. I'm also still having some trouble understanding composite columns, as they don't seem to be fully documented yet. Would this solution work with composite columns?

Recommended answer

This answer dates from before the release of Cassandra 1.2, which introduced substantially different functionality for handling lists. It may not be appropriate if you are using Cassandra 1.2+.

As mentioned on the mailing list, my preference, which has worked very well for me, is to store a single column "skills" whose value is a serialized JSON string.
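A minimal sketch of the single-column idea, in Python. The `profiles` dict is only a hypothetical in-memory stand-in for the column family (row key to column name to value), and `save_skills`, `load_skills` and `add_skill` are illustrative names, not part of Astyanax or any Cassandra client. The point is just that the whole list travels as one JSON string:

```python
import json

# Hypothetical in-memory stand-in for a "profiles" column family:
# row key -> {column name -> column value}. Not a Cassandra client API.
profiles = {}

def save_skills(user_id, skills):
    """Store the whole skill list as one JSON string in a 'skills' column."""
    row = profiles.setdefault(user_id, {})
    row["skills"] = json.dumps(sorted(set(skills)))

def load_skills(user_id):
    """Read the 'skills' column back into a list (empty if the column is absent)."""
    raw = profiles.get(user_id, {}).get("skills")
    return json.loads(raw) if raw is not None else []

def add_skill(user_id, skill):
    """Adding one skill means reading, modifying and rewriting the whole column."""
    save_skills(user_id, load_skills(user_id) + [skill])

save_skills("user-1", ["java", "c++"])
add_skill("user-1", "cobol")
print(profiles["user-1"]["skills"])
```

The stored value stays human-readable, but every change rewrites the full list, so two concurrent writers to the same profile can overwrite each other's update.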

It really comes down to the usage patterns you have for "skills".

  • If "skills" are just for CRUD on a per-user basis, this is fine.
  • If you want to be able to search for all users that have a skill of "cobol", then I would still recommend this approach, plus another row keyed skill:cobol with one column per user, where the column name is the user's UUID and the value is a timestamp or something similar.
  • I'm sure that with Pig/Hadoop integration on your Cassandra nodes, you could still quite happily query for all of the users that have x, y and z to generate new data that supports additional use cases.
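The second bullet can be sketched as follows, again in Python with a hypothetical in-memory `rows` dict standing in for the column family. The `profile:` key prefix, the short string user ids (in place of UUIDs) and the function names are assumptions made for illustration. Each `skill:<name>` row holds one column per user, with a timestamp as the value, and is kept in sync whenever the JSON "skills" column is rewritten:

```python
import json
import time

# Hypothetical in-memory stand-in for a column family; not a client API.
# row key -> {column name -> column value}
rows = {}

def set_skills(user_id, skills):
    """Write the JSON 'skills' column and update the skill:<name> index rows."""
    profile = rows.setdefault("profile:" + user_id, {})
    old = set(json.loads(profile.get("skills", "[]")))
    new = set(skills)
    profile["skills"] = json.dumps(sorted(new))
    now = time.time()
    for skill in new - old:
        # Column name = user id, column value = timestamp.
        rows.setdefault("skill:" + skill, {})[user_id] = now
    for skill in old - new:
        # Drop the user's column from index rows for skills they no longer have.
        rows.get("skill:" + skill, {}).pop(user_id, None)

def users_with(skill):
    """All user ids listed in the index row for one skill."""
    return sorted(rows.get("skill:" + skill, {}))

def users_with_all(*skills):
    """Client-side intersection of several index rows (not a Pig/Hadoop job)."""
    sets = [set(rows.get("skill:" + s, {})) for s in skills]
    return sorted(set.intersection(*sets)) if sets else []

set_skills("u1", ["java", "cobol"])
set_skills("u2", ["cobol"])
set_skills("u1", ["java"])  # u1 drops cobol, so the index row is updated too
```

After these calls, `users_with("cobol")` returns only `u2`. In a real cluster the profile write and the index writes are separate operations, so the application has to tolerate the two briefly disagreeing.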
