Cassandra CQL where clause with multiple collection values?


Problem description

My data model:

tid                                  | codes        | raw          | type
-------------------------------------+--------------+--------------+------
a64fdd60-1bc4-11e5-9b30-3dca08b6a366 | {12, 34, 53} | {sdafb=safd} |  cmd

CREATE TABLE MyTable (
tid       TIMEUUID,
type      TEXT,
codes     SET<INT>,
raw       TEXT,
PRIMARY KEY (tid)
);
CREATE INDEX ON myTable (codes);

How can I query the table to return rows that match multiple set values?

This works:

select * from logData where codes contains 34;

But I want to get rows based on multiple set values, and none of these work:

select * from logData where codes contains 34, 12; or 
select * from logData where codes contains 34 and 12; or
select * from logData where codes contains {34, 12};

Kindly assist.

Solution

If I create your table structure and insert a similar row to yours above, I can check for multiple values in the codes collection like this:

aploetz@cqlsh:stackoverflow2> SELECT * FROM mytable 
    WHERE codes CONTAINS 34 
      AND codes CONTAINS 12
      ALLOW FILTERING;

 tid                                  | codes        | raw          | type
--------------------------------------+--------------+--------------+------
 2569f270-1c06-11e5-92f0-21b264d4c94d | {12, 34, 53} | {sdafb=safd} |  cmd

(1 rows)

Now, as others have mentioned, let me also tell you why this is a terrible idea...

First, with a secondary index on the collection (and with the cardinality appearing to be fairly high), every node will have to be checked for each query. The idea with Cassandra is to query by partition key as often as possible, so that you only have to hit one node per query. Apple's Richard Low wrote a great article called "The sweet spot for Cassandra secondary indexes". It should make you rethink the way you use secondary indexes.

Secondly, the only way I could get Cassandra to accept this query was to use ALLOW FILTERING. This means that the only way Cassandra can apply all of your filtering criteria (the WHERE clause) is to pull back every row and individually filter out the rows that do not meet your criteria. That is horribly inefficient. To be clear, the ALLOW FILTERING directive is something that you should never use.
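For contrast, here is a minimal sketch of a lookup restricted by the partition key of the table from the question. It reuses the tid of the sample row above and assumes nothing beyond the schema already shown:

```
-- tid is the partition key of mytable, so this reads a single partition
-- instead of consulting the secondary index on every node.
SELECT * FROM mytable
 WHERE tid = 2569f270-1c06-11e5-92f0-21b264d4c94d;
```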

In any case, if codes are something that you will need to query by, then you should design an additional query table with codes as a part of the PRIMARY KEY.
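One possible shape for such a query table is sketched below. The table name mytable_by_code and the choice to write one denormalized row per code are assumptions made for illustration; they are not part of the original question or answer.

```
-- Hypothetical query table: one row per (code, tid) pair.
-- code is the partition key, so a lookup by code reads a single partition.
CREATE TABLE mytable_by_code (
    code  INT,
    tid   TIMEUUID,
    type  TEXT,
    codes SET<INT>,
    raw   TEXT,
    PRIMARY KEY (code, tid)
);

-- The application writes the source row once for each code it contains.
INSERT INTO mytable_by_code (code, tid, type, codes, raw)
VALUES (12, 2569f270-1c06-11e5-92f0-21b264d4c94d, 'cmd', {12, 34, 53}, '{sdafb=safd}');
INSERT INTO mytable_by_code (code, tid, type, codes, raw)
VALUES (34, 2569f270-1c06-11e5-92f0-21b264d4c94d, 'cmd', {12, 34, 53}, '{sdafb=safd}');
INSERT INTO mytable_by_code (code, tid, type, codes, raw)
VALUES (53, 2569f270-1c06-11e5-92f0-21b264d4c94d, 'cmd', {12, 34, 53}, '{sdafb=safd}');

-- Needs neither a secondary index nor ALLOW FILTERING.
SELECT * FROM mytable_by_code WHERE code = 34;
```

To find rows that contain both 34 and 12, the client can run the last query for one of the codes and keep only the returned rows whose codes set also contains the other value. The cost of this design is extra writes: every source row is stored once per code.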
