Cassandra: cannot restrict 2 columns using clustering key (version 2.1.9)
Question
I have a schema pretty similar to this:
create table x(id int, start_date timestamp, end_date timestamp,
primary key((id), start_date, end_date))
with clustering order by (start_date desc, end_date desc);
Now I am stuck on a problem where I have to query between a start date and an end date, something like this:
select count(*) from x where id=2 and start_date > 'date' and end_date < 'date' ;
But it gives me an error similar to the following:
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "end_date"
cannot be restricted (preceding column "start_date" is restricted
by a non-EQ relation)"
I am new to Cassandra. Any and all suggestions are welcome, even if they require a schema change. :)
Recommended answer
You don't say which version of Cassandra you are running, but in 2.2 and later you can use multi-column slice restrictions on clustering columns. This can get close to what you want. The CQL syntax is a little ugly: you specify each bound of the range as a tuple containing all the clustering columns, like a compound key. It's important to keep in mind that rows are sorted first by the first clustering column and then, within equal values of it, by the second.
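To make that ordering concrete, here is a minimal Python sketch. It assumes the multi-column slice compares (start_date, end_date) the same way Python compares tuples, that is, lexicographically. The helper name `in_slice` and the dates are hypothetical and only for illustration; nothing here talks to Cassandra.

```python
from datetime import date

def in_slice(row, lower, upper):
    """Return True if row falls inside the inclusive tuple range.

    Tuples compare lexicographically: the first element decides,
    and the second element is consulted only when the first is equal.
    """
    return lower <= row <= upper

# Hypothetical bounds, written as (start_date, end_date) tuples.
lower = (date(2015, 7, 1), date(2015, 7, 1))
upper = (date(2015, 9, 1), date(2015, 9, 1))

# start_date is strictly between the bounds, so end_date is never compared.
middle_row = (date(2015, 8, 1), date(2015, 10, 1))

# start_date ties with the upper bound, so end_date decides (and is too late).
edge_row = (date(2015, 9, 1), date(2015, 11, 1))

print(in_slice(middle_row, lower, upper))  # True
print(in_slice(edge_row, lower, upper))    # False
```

The point of the sketch is that the second column only acts as a tie-breaker, so a slice on both columns is not the same as two independent range conditions.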
So assume we have this data:
SELECT * from x;
id | start_date | end_date
----+--------------------------+--------------------------
2 | 2015-09-01 09:16:47+0000 | 2015-11-01 09:16:47+0000
2 | 2015-08-01 09:16:47+0000 | 2015-10-01 09:16:47+0000
2 | 2015-07-01 09:16:47+0000 | 2015-09-01 09:16:47+0000
2 | 2015-06-01 09:16:47+0000 | 2015-10-01 09:16:47+0000
Now let's select based on both dates:
SELECT * from x where id=2
and (start_date,end_date) >= ('2015-07-01 09:16:47+0000','2015-07-01 09:16:47+0000')
and (start_date,end_date) <= ('2015-09-01 09:16:47+0000','2015-09-01 09:16:47+0000');
id | start_date | end_date
----+--------------------------+--------------------------
2 | 2015-08-01 09:16:47+0000 | 2015-10-01 09:16:47+0000
2 | 2015-07-01 09:16:47+0000 | 2015-09-01 09:16:47+0000
You'll notice that one of those end dates (2015-10-01) is later than the end_date in our upper bound, yet the row is still inside the range. Since rows are sorted by start_date first, end_date is only compared when a row's start_date equals the start_date of a bound; any row whose start_date lies strictly between the two bounds matches regardless of its end_date. To get rid of rows like that you'll probably need to do a little filtering on the client side.
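A minimal sketch of that client-side filtering in Python follows. The rows are hard-coded copies of the two rows returned above; in a real client they would come from whatever driver you use. The helper names `ts` and `filter_by_end_date` are hypothetical, and the inclusive bounds are an assumption chosen to mirror the >= and <= in the query.

```python
from datetime import datetime

def ts(text):
    """Parse a timestamp in the format shown in the query output."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S%z")

# Rows as (id, start_date, end_date), copied from the slice query result.
rows = [
    (2, ts("2015-08-01 09:16:47+0000"), ts("2015-10-01 09:16:47+0000")),
    (2, ts("2015-07-01 09:16:47+0000"), ts("2015-09-01 09:16:47+0000")),
]

def filter_by_end_date(rows, end_min, end_max):
    """Keep only rows whose end_date is inside the inclusive range."""
    return [row for row in rows if end_min <= row[2] <= end_max]

end_min = ts("2015-07-01 09:16:47+0000")
end_max = ts("2015-09-01 09:16:47+0000")

kept = filter_by_end_date(rows, end_min, end_max)
for row in kept:
    print(row[0], row[1].isoformat(), row[2].isoformat())
```

With these bounds only the row starting on 2015-07-01 survives, because the other row ends on 2015-10-01. If you need the strict comparisons from the original question, swap `<=` for `<` in the helper.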
See more information here, under "Multi-column slice restrictions".