Choosing a partition key for a Cassandra table: how many partitions is too many?

Question

I have an application where the 'natural' partition key for a Cassandra table seems to be 'customer'. This is the primary way we want to query the data, and it would give us good data distribution.

But if there were well over 1 million customers, would that be too many different partitions?

Should I choose a partition key that results in a smaller number of partitions?

Recommended answer


But if there were well over 1 million customers, would that be too many different partitions?

No. The Murmur3Partitioner hashes partition keys onto a token range of 2^64 values (-2^63 to +2^63-1), so a million partitions is nowhere near any limit. Cassandra is designed to be very good at storing large amounts of data and retrieving it by partition key. There is a limit on the number of cells within a single partition (2 billion), but for the total number of partitions I think you'll be fine with what you have.
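To put the numbers above in perspective, here is a minimal arithmetic sketch. It only assumes the token range quoted in the answer; the figure of 1,000,000 customers is taken from the question as a round number. Note that the token count is the size of the hash space, not a hard cap on partitions, since two different keys may in principle hash to the same token.

```python
# Token range used by Murmur3Partitioner, as described in the answer.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

# Number of distinct token values in the range (inclusive on both ends).
TOKEN_COUNT = MAX_TOKEN - MIN_TOKEN + 1

# Round figure from the question: one partition per customer.
customers = 1_000_000

# How many token values exist for every customer partition.
tokens_per_customer = TOKEN_COUNT // customers

print(f"tokens in range:     {TOKEN_COUNT}")
print(f"tokens per customer: {tokens_per_customer}")
```

Even with a million customers, each partition has on the order of 10^13 token values "to itself", which is why the partition count is not the thing to worry about here.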


Should I choose a partition key that results in a smaller number of partitions?

Definitely not. A key with fewer distinct values could cause your partitions to grow too big, and/or create "hot spots" in your cluster.
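The hot-spot effect can be illustrated with a rough simulation. This is a simplified sketch, not how Cassandra actually places data: it uses MD5 from the standard library as a stand-in for Murmur3, and maps hashes to nodes with a plain modulo instead of a token ring. The key names (`customer-N`, `region-N`), the node count, and the row count are all hypothetical.

```python
import hashlib
from collections import Counter

def node_for(key: str, nodes: int) -> int:
    """Map a partition key to a node (assumption: hash modulo node count)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes

def rows_per_node(partition_keys, nodes: int) -> Counter:
    """Count how many rows land on each node, one row per key occurrence."""
    counts = Counter({n: 0 for n in range(nodes)})
    for key in partition_keys:
        counts[node_for(key, nodes)] += 1
    return counts

NODES = 6       # hypothetical cluster size
ROWS = 60_000   # hypothetical number of rows, one per customer

# High-cardinality key: every customer is its own partition.
by_customer = rows_per_node((f"customer-{i}" for i in range(ROWS)), NODES)

# Low-cardinality key: all rows are squeezed into 3 partitions.
by_region = rows_per_node((f"region-{i % 3}" for i in range(ROWS)), NODES)

print("per-customer key:", sorted(by_customer.values()))
print("per-region key:  ", sorted(by_region.values()))
```

With the per-customer key the rows spread roughly evenly over all six nodes. With only three distinct keys, at most three nodes can hold any data at all, and each of those partitions holds a third of the table.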

The main task in picking a good partition key is to find one that both distributes data well across the cluster and matches your query patterns. From what I'm reading, it sounds like you have done exactly that.
