Cassandra: cannot restrict 2 columns using clustering key (version 2.1.9)


Question

I have a schema pretty similar to this:

create table x(id int, start_date timestamp, end_date timestamp, 
primary key((id), start_date, end_date)) 
with clustering order by (start_date desc, end_date desc);

Now I am stuck on a problem where I have to query between a start date and an end date, something like this:

select count(*) from x where id=2 and start_date > 'date' and end_date < 'date' ;

But it gives me an error similar to the following:

InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "end_date" 
cannot be restricted (preceding column "start_date" is restricted 
by a non-EQ relation)"

I am new to Cassandra. Any and all suggestions are welcome, even if they require a schema change. :)

Recommended answer

You don't say which version of Cassandra you are running, but in 2.2 and later you can do multi-column slice restrictions on clustering columns. This can get close to what you want. The syntax in CQL is a little ugly, but basically you have to give each end of the range as a tuple that specifies all the clustering columns, like a compound key. It's important to remember that rows are sorted first by the first clustering column, and then, within equal values of that column, by the second.
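As a rough mental model (not Cassandra code), a multi-column slice behaves like lexicographic tuple comparison, which Python tuples happen to implement. The sketch below assumes that model; `in_slice` and the sample values are hypothetical names and data chosen for illustration, and no Cassandra driver is involved.

```python
from datetime import datetime

def in_slice(row, lower, upper):
    """Return True when lower <= row <= upper, comparing tuples element by element."""
    return lower <= row <= upper

# Bounds written as (start_date, end_date) tuples, like the CQL tuple syntax.
lower = (datetime(2015, 7, 1), datetime(2015, 7, 1))
upper = (datetime(2015, 9, 1), datetime(2015, 9, 1))

# start_date is strictly inside the bounds, so end_date is never compared.
inside = (datetime(2015, 8, 1), datetime(2015, 12, 31))

# start_date equals the upper bound, so end_date breaks the tie.
on_edge = (datetime(2015, 9, 1), datetime(2015, 11, 1))

print(in_slice(inside, lower, upper))   # True
print(in_slice(on_edge, lower, upper))  # False
```

The second column only matters when the first column is equal to one of the bounds, which is why a slice on two columns is not the same as two independent range conditions.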

So assume we have this data:

SELECT * from x;

 id | start_date               | end_date
----+--------------------------+--------------------------
  2 | 2015-09-01 09:16:47+0000 | 2015-11-01 09:16:47+0000
  2 | 2015-08-01 09:16:47+0000 | 2015-10-01 09:16:47+0000
  2 | 2015-07-01 09:16:47+0000 | 2015-09-01 09:16:47+0000
  2 | 2015-06-01 09:16:47+0000 | 2015-10-01 09:16:47+0000

Now let's select based on both dates:

SELECT * from x where id=2 
    and (start_date,end_date) >= ('2015-07-01 09:16:47+0000','2015-07-01 09:16:47+0000') 
    and (start_date,end_date) <= ('2015-09-01 09:16:47+0000','2015-09-01 09:16:47+0000');

 id | start_date               | end_date
----+--------------------------+--------------------------
  2 | 2015-08-01 09:16:47+0000 | 2015-10-01 09:16:47+0000
  2 | 2015-07-01 09:16:47+0000 | 2015-09-01 09:16:47+0000

You'll notice that one of those end dates is later than the end_date in our upper bound, yet the row still falls inside the range. Since rows are sorted by start_date first, any row whose start_date lies strictly between the two bounds matches regardless of its end_date; end_date is only compared when start_date equals one of the bounds. To get rid of rows like that you'll probably need to do a little filtering on the client side.
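The client-side filtering step could look something like the following minimal sketch. It assumes the two rows returned by the query above have already been fetched into plain tuples; the helper names `ts` and `filter_by_end_date` are hypothetical, and the rows are hard-coded rather than read through a driver.

```python
from datetime import datetime

def ts(text):
    """Parse a timestamp in the format cqlsh printed above."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S%z")

# Rows as returned by the multi-column slice query: (id, start_date, end_date).
rows = [
    (2, ts("2015-08-01 09:16:47+0000"), ts("2015-10-01 09:16:47+0000")),
    (2, ts("2015-07-01 09:16:47+0000"), ts("2015-09-01 09:16:47+0000")),
]

def filter_by_end_date(rows, end_bound):
    """Keep only rows whose end_date is not later than end_bound."""
    return [row for row in rows if row[2] <= end_bound]

end_bound = ts("2015-09-01 09:16:47+0000")
kept = filter_by_end_date(rows, end_bound)

for row_id, start_date, end_date in kept:
    print(row_id, start_date.isoformat(), end_date.isoformat())
```

Only the row starting on 2015-07-01 survives, because the other row ends on 2015-10-01. Use `<` instead of `<=` in the filter if the end bound should be exclusive, as in the original question.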

See more information here, under "Multi-column slice restrictions".
