Difference between partition key, composite key and clustering key in Cassandra?


Question

I have been reading articles around the net to understand the differences between the following key types, but they are still hard for me to grasp. Examples would definitely help.

primary key,
partition key, 
composite key 
clustering key

Recommended answer

There is a lot of confusion around this, so I will try to make it as simple as possible.

The primary key is a general concept that indicates one or more columns used to retrieve data from a table.

The primary key may be SIMPLE and even declared inline:

 create table stackoverflow_simple (
      key text PRIMARY KEY,
      data text      
  );

That means it consists of a single column.

But the primary key can also be COMPOSITE (aka COMPOUND), made up of several columns.

 create table stackoverflow_composite (
      key_part_one text,
      key_part_two int,
      data text,
      PRIMARY KEY(key_part_one, key_part_two)      
  );

With a COMPOSITE primary key, the "first part" of the key is called the PARTITION KEY (in this example, key_part_one is the partition key) and the second part of the key is the CLUSTERING KEY (in this example, key_part_two).

Note that both the partition key and the clustering key can be made up of multiple columns, as follows:

 create table stackoverflow_multiple (
      k_part_one text,
      k_part_two int,
      k_clust_one text,
      k_clust_two int,
      k_clust_three uuid,
      data text,
      PRIMARY KEY((k_part_one, k_part_two), k_clust_one, k_clust_two, k_clust_three)      
  );

Behind these names ...

  • The Partition Key is responsible for data distribution across your nodes.
  • The Clustering Key is responsible for data sorting within the partition.
  • The Primary Key is equivalent to the Partition Key in a single-field-key table (i.e. Simple).
  • The Composite/Compound Key is just any multiple-column key.
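
To make the first two roles a little more concrete, here is a minimal sketch that reuses the stackoverflow_composite table defined earlier. token() is a built-in CQL function that returns the partitioner token computed from the partition key. The table name stackoverflow_composite_desc is hypothetical and only serves to illustrate the CLUSTERING ORDER BY option.

```cql
-- Rows that share a partition key value get the same token,
-- which is what places them together in one partition.
SELECT token(key_part_one), key_part_one, key_part_two
  FROM stackoverflow_composite;

-- Hypothetical variant of the same table: within each partition,
-- rows are stored sorted by key_part_two in descending order.
CREATE TABLE stackoverflow_composite_desc (
    key_part_one text,
    key_part_two int,
    data text,
    PRIMARY KEY (key_part_one, key_part_two)
) WITH CLUSTERING ORDER BY (key_part_two DESC);
```

The default clustering order is ascending, so the DESC option is only needed when the most common read pattern wants the reverse.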

Further usage information: DATASTAX DOCUMENTATION

Small usage and content examples

SIMPLE key:

insert into stackoverflow_simple (key, data) VALUES ('han', 'solo');
select * from stackoverflow_simple where key='han';

Table content

key | data
----+------
han | solo

A COMPOSITE/COMPOUND key can retrieve "wide rows" (i.e. you can query by just the partition key, even if you have clustering keys defined):

insert into stackoverflow_composite (key_part_one, key_part_two, data) VALUES ('ronaldo', 9, 'football player');
insert into stackoverflow_composite (key_part_one, key_part_two, data) VALUES ('ronaldo', 10, 'ex-football player');
select * from stackoverflow_composite where key_part_one = 'ronaldo';

Table content

 key_part_one | key_part_two | data
--------------+--------------+--------------------
      ronaldo |            9 |    football player
      ronaldo |           10 | ex-football player

But you can also query with all key columns (both partition and clustering):

select * from stackoverflow_composite 
   where key_part_one = 'ronaldo' and key_part_two  = 10;

Query output

 key_part_one | key_part_two | data
--------------+--------------+--------------------
      ronaldo |           10 | ex-football player
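
Because rows inside a partition are sorted by the clustering key, range and ordering queries on it are also possible once the partition key is fixed. The following is a small sketch against the same stackoverflow_composite table, not part of the original answer:

```cql
-- Range over the clustering key within a single partition.
-- With the two rows inserted above, only key_part_two = 10 matches.
SELECT * FROM stackoverflow_composite
 WHERE key_part_one = 'ronaldo' AND key_part_two >= 10;

-- Reverse the clustering order for this query only.
SELECT * FROM stackoverflow_composite
 WHERE key_part_one = 'ronaldo'
 ORDER BY key_part_two DESC;
```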

Important note: the partition key is the minimum specifier needed to perform a query with a WHERE clause. If you have a composite partition key, like the following:

e.g. PRIMARY KEY((col1, col2), col10, col4)

You can query only by passing at least both col1 and col2, the two columns that define the partition key. The general rule is that you must pass at least all partition key columns; you can then optionally add clustering key columns in the order in which they are declared.

So the valid queries are (excluding secondary indexes):

  • col1 and col2
  • col1 and col2 and col10
  • col1 and col2 and col10 and col4

Invalid:

  • col1 and col2 and col4
  • anything that does not contain both col1 and col2
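
The rule can be sketched in CQL as follows. The table name stackoverflow_rule, the column types and the data column are assumptions made for illustration; only the PRIMARY KEY shape comes from the example above. The invalid queries are left commented out because Cassandra rejects them by default.

```cql
-- Hypothetical table with the primary key shape discussed above.
CREATE TABLE stackoverflow_rule (
    col1 text,
    col2 int,
    col10 text,
    col4 int,
    data text,
    PRIMARY KEY ((col1, col2), col10, col4)
);

-- Valid: the full partition key.
SELECT * FROM stackoverflow_rule
 WHERE col1 = 'a' AND col2 = 1;

-- Valid: the full partition key plus the first clustering column.
SELECT * FROM stackoverflow_rule
 WHERE col1 = 'a' AND col2 = 1 AND col10 = 'x';

-- Valid: the full partition key plus both clustering columns, in order.
SELECT * FROM stackoverflow_rule
 WHERE col1 = 'a' AND col2 = 1 AND col10 = 'x' AND col4 = 2;

-- Invalid: col4 is restricted while the preceding clustering column col10 is not.
-- SELECT * FROM stackoverflow_rule
--  WHERE col1 = 'a' AND col2 = 1 AND col4 = 2;

-- Invalid: only part of the partition key is given.
-- SELECT * FROM stackoverflow_rule
--  WHERE col1 = 'a';
```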
