Clustering Keys in Cassandra

Question

The CQL documentation (http://cassandra.apache.org/doc/cql3/CQL.html#createTableStmt) says: "On a given physical node, rows for a given partition key are stored in the order induced by the clustering keys, making the retrieval of rows in that clustering order particularly efficient." What kind of ordering is induced by clustering keys?

Recommended answer

Suppose your clustering keys are

k1 t1, k2 t2, ..., kn tn

where ki is the ith key name and ti is the ith key type. Then the data is stored in lexicographic order, where each dimension is compared using the comparator for its type.

So (a1, a2, ..., an) < (b1, b2, ..., bn) if a1 < b1 using the t1 comparator, or a1 = b1 and a2 < b2 using the t2 comparator, or a1 = b1 and a2 = b2 and a3 < b3 using the t3 comparator, and so on.
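The comparison rule can be sketched in a few lines of Python. This is only a model of the ordering, not Cassandra's implementation: the function name clustering_less and the helper ts are made up for illustration, and Python's built-in < stands in for each type's comparator.

```python
from datetime import datetime, timezone


def ts(hour):
    """Hypothetical helper: a UTC timestamp on 2013-09-10 at the given hour."""
    return datetime(2013, 9, 10, hour, tzinfo=timezone.utc)


def clustering_less(a, b):
    """Return True if clustering tuple a sorts strictly before b.

    Assumption: Python's < plays the role of each column type's comparator.
    """
    for ai, bi in zip(a, b):
        if ai != bi:
            # The first component that differs decides the order.
            return ai < bi
    # Every component is equal, so a is not strictly before b.
    return False


# k1 decides here, even though k2 is larger in the first tuple.
first_wins = clustering_less(("a", 2, ts(13)), ("b", 1, ts(13)))

# k1 and k2 tie, so k3 (the timestamp) decides.
third_decides = clustering_less(("b", 1, ts(13)), ("b", 1, ts(14)))
```

Python's own tuple comparison follows the same lexicographic rule, so `a < b` on tuples gives the same answer as clustering_less(a, b).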

This means that it is efficient to find all rows with a certain k1=a, since those rows are stored together. But it is inefficient to find all rows with ki=x for i > 1. In fact, such a query isn't allowed: the only clustering key constraints that are allowed specify zero or more clustering keys, starting from the first with none missing.

For example, consider the schema

create table clustering (
    x text,
    k1 text,
    k2 int,
    k3 timestamp,
    y text,
    primary key (x, k1, k2, k3)
);

If you do the following inserts:

insert into clustering (x, k1, k2, k3, y) values ('x', 'a', 1, '2013-09-10 14:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'b', 1, '2013-09-10 13:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'a', 2, '2013-09-10 13:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'b', 1, '2013-09-10 14:00+0000', '1');

then they are stored in this order on disk (the order that select * from clustering where x = 'x' returns):

 x | k1 | k2 | k3                       | y
---+----+----+--------------------------+---
 x |  a |  1 | 2013-09-10 14:00:00+0000 | 1
 x |  a |  2 | 2013-09-10 13:00:00+0000 | 1
 x |  b |  1 | 2013-09-10 13:00:00+0000 | 1
 x |  b |  1 | 2013-09-10 14:00:00+0000 | 1

The k1 ordering dominates, then k2, then k3.
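As a rough check of the example, the sketch below models the partition x = 'x' as a sorted Python list of (k1, k2, k3) tuples. It is an illustration under stated assumptions, not how Cassandra stores data: the names ts, inserted, partition and the bisect-based lookup are all hypothetical. It shows why a restriction on k1 maps to one contiguous slice, while a restriction on k2 alone does not.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timezone


def ts(hour):
    """Hypothetical helper: a UTC timestamp on 2013-09-10 at the given hour."""
    return datetime(2013, 9, 10, hour, tzinfo=timezone.utc)


# (k1, k2, k3) for the four inserts, in the order they were issued.
inserted = [
    ("a", 1, ts(14)),
    ("b", 1, ts(13)),
    ("a", 2, ts(13)),
    ("b", 1, ts(14)),
]

# Tuple sorting is lexicographic, which models the clustering order.
partition = sorted(inserted)

# Restricting k1 = 'b': binary search finds one contiguous slice.
k1_values = [row[0] for row in partition]
lo = bisect_left(k1_values, "b")
hi = bisect_right(k1_values, "b")
k1_b_rows = partition[lo:hi]

# Restricting only k2 = 1: the matches are scattered, so every row
# in the partition has to be examined.
k2_positions = [i for i, row in enumerate(partition) if row[1] == 1]
```

In this model the rows with k1 = 'b' occupy positions 2 and 3, next to each other, while the rows with k2 = 1 sit at positions 0, 2 and 3 with a non-matching row in between.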
