Cassandra UUID as row key


Question

Why are keys in Cassandra usually defined as UUIDs? It looks like the key is generated on the client side, so why not just store it as a string? What is the benefit of storing it specifically as a UUID?

Recommended answer

You can use any key with Cassandra; a key is a byte array anyway. If a client wants a key like "foobar" or any other string of arbitrary length, there is nothing wrong with that. The Cassandra client converts it into an array of bytes before transmitting it to the Cassandra server. Technically, it will be stored as "foobar" on the server side.
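As a rough illustration of the point that every key ends up as a byte array, the Python sketch below encodes a string key the way a client might before sending it. The UTF-8 encoding and the `to_key_bytes` helper are assumptions made for illustration, not the behaviour of any particular Cassandra driver.

```python
def to_key_bytes(key: str) -> bytes:
    """Encode a string row key as bytes (UTF-8 is an assumption here)."""
    return key.encode("utf-8")


key_bytes = to_key_bytes("foobar")
print(key_bytes)       # the raw bytes that would make up the key
print(len(key_bytes))  # key length in bytes grows with the string length
```

A longer string yields a proportionally longer key, which is why key length matters.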

There are other things to consider when deciding on a key format:


  • Key length has a direct impact on Cassandra performance. Keep keys as short as is reasonable while they remain useful for the data access you need. A short key that is useless for data access is no better than a longer key with better get/scan properties. Expect trade-offs when designing keys. If you have long strings as keys, it might be a good idea to hash them into UUIDs.
  • You can store a UUID as a human-readable string such as 'f5606950-98d1-11e3-a5e2-0800200c9a66', but a much better idea is to use the internal UUID data type, which needs only 16 bytes to store it.
  • You need to decide upfront whether to use OrderPreservingPartitioner or RandomPartitioner. There are a number of trade-offs, but the most important one is how the choice affects key distribution across the cluster. People often choose OrderPreservingPartitioner because it allows meaningful scans, but depending on the key values it typically leads to hot and cold Cassandra nodes. To mitigate that, one can either use a hash of the original key (a UUID) or prepend a UUID to the real key.
  • How do you plan to access your keys? Access patterns range from a simple get, to slice, to the often overlooked delete. People often find that a UUID is a good compromise.
  • How do you plan to load-balance your data?
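The first two points above (hashing long string keys into UUIDs, and storing a UUID in 16 bytes rather than as text) can be sketched in Python with the standard `uuid` module. This is a minimal sketch, not a definitive implementation: the `KEY_NAMESPACE` value, the `key_to_uuid` helper, and the sample key are hypothetical, and a name-based version 5 UUID is only one possible way to hash a key.

```python
import uuid

# Hypothetical namespace for this application's keys. Any fixed UUID works,
# as long as every client uses the same one.
KEY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "keys.example.com")


def key_to_uuid(long_key: str) -> uuid.UUID:
    """Deterministically hash an arbitrary-length string key into a
    version 5 (SHA-1, name-based) UUID."""
    return uuid.uuid5(KEY_NAMESPACE, long_key)


long_key = "customer/2014/02/18/orders/some-very-long-natural-key"
row_key = key_to_uuid(long_key)

as_text = str(row_key).encode("ascii")  # human-readable form
as_binary = row_key.bytes               # compact binary form

print(len(as_text), len(as_binary))     # prints "36 16"
```

Because the hash is deterministic, a client can always recompute the row key from the natural key. The trade-off is that hashed keys no longer sort in the natural key's order, so range scans over the natural key stop being meaningful.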
