CQL SELECT greater-than query on indexed non-key column


Problem description


EDIT1: added a case to describe the problem after the original question.

I wish to query on a column that is not part of my key. If I understand correctly, I need to define a secondary index on that column. However, I wish to use a greater-than condition (not just an equality condition), and that still seems unsupported.

Am I missing something? How would you address this issue?

My desired Setup:

Cassandra 1.1.6
CQL3

CREATE TABLE Table1(
             KeyA int,
             KeyB int,
             ValueA int,
             PRIMARY KEY (KeyA, KeyB)
           );

CREATE INDEX ON Table1 (ValueA);

SELECT * FROM Table1 WHERE ValueA > 3000;

Since defining a secondary index on column families with composite keys is still not supported in Cassandra 1.1.6, I have to settle for a temporary solution of dropping one of the keys, but I still have the same problem with non-equality conditions.

Is there another way to address this?

Thank you for your time.

Relevant sources:
http://cassandra.apache.org/doc/cql3/CQL.html#selectStmt
http://www.datastax.com/docs/1.1/ddl/indexes


EDIT1

Here's a case that will explain the problem. As rs-atl noted, it might be a data model problem. Let's say I keep a column family of all the users on Stack Overflow. For each user I keep a batch of stats (Reputation, NumOfAnswers, NumOfVotes... all of them are int). I want to query on those stats to get the relevant users.

CREATE TABLE UserStats(
             UserID int,
             Reputation int,
             NumOfAnswers int,
             -- ... a lot of stats ...
             NumOfVotes int,
             PRIMARY KEY (UserID)
           );

Now I'm interested in slicing UserIDs based on those stats. I want all the users with over 10K reputation, all the users with fewer than 5 answers, and so on.

I hope that helps. Thanks again.

Solution

In CQL, you can apply the WHERE clause to a non-key column once you have created an index on it (i.e., a secondary index). Otherwise, you will get the following error:

Bad Request: No indexed columns present in by-columns clause with Equal operator

Unfortunately, even with secondary indices, CQL requires the WHERE clause to contain at least one EQ comparison on a secondary index, for performance reasons.

Q: Why is it necessary to always have at least one EQ comparison on secondary indices?

A: Inequalities on secondary indices are always done in memory, so without at least one EQ on another secondary index you will be loading every row in the database, which with a massive database isn't a good idea. So by requiring at least one EQ on an index, you hopefully limit the set of rows that need to be read into memory to a manageable size. (Although obviously you can still get into trouble with that as well).
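
To make that reasoning concrete, here is a minimal Python sketch of the idea. It is only a toy model and not how Cassandra is actually implemented: the index is a plain dictionary, and the `country` column and the sample rows are assumptions invented for the illustration. The EQ term is answered from the index, and the inequality is then checked in memory against the candidate rows only.

```python
from collections import defaultdict

# Hypothetical rows: (user_id, country, reputation). The "country" column is an
# assumption added for illustration; it is not part of the original schema.
ROWS = [
    (1, "US", 12000),
    (2, "US", 300),
    (3, "DE", 15000),
    (4, "FR", 50),
    (5, "US", 25000),
]


def build_index(rows, column):
    """Map each value of the indexed column to the positions of matching rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


def select(rows, index, eq_value, predicate):
    """Use the index for the EQ term, then apply the inequality in memory.

    Returns the matching rows and the number of rows that had to be loaded.
    """
    candidates = [rows[pos] for pos in index.get(eq_value, [])]
    matches = [row for row in candidates if predicate(row)]
    return matches, len(candidates)


country_index = build_index(ROWS, column=1)

# Models: WHERE country = 'US' AND reputation > 10000
matches, loaded = select(ROWS, country_index, "US", lambda r: r[2] > 10000)
print(matches, loaded)  # [(1, 'US', 12000), (5, 'US', 25000)] 3

# With no EQ term there is nothing to narrow the search: every row is loaded.
full_scan = [row for row in ROWS if row[2] > 10000]
print(len(full_scan), len(ROWS))  # 3 5
```

In this toy data the EQ term cuts the rows loaded from 5 to 3; the more selective the EQ column, the bigger the saving.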

One thing to note is that if you have more than one non-EQ condition on secondary indices, you also need to include the ALLOW FILTERING keyword in your query.

One simple workaround is to add a dummy column to your table in which every row has the same value. With an EQ condition on that column, you can then perform a range query on the column you actually care about.


Example

cqlsh:demo> desc table table1;

CREATE TABLE table1 (
  keya int,
  keyb int,
  dummyvalue int,
  valuea int,
  PRIMARY KEY (keya, keyb)
) ....

cqlsh:demo> select * from Table1;

 keya | keyb | dummyvalue | valuea
------+------+------------+--------
    1 |    2 |          0 |      3
    4 |    5 |          0 |      6
    7 |    8 |          0 |      9

Create secondary indices on ValueA and DummyValue:

cqlsh:demo> create index table1_valuea on table1 (valuea);
cqlsh:demo> create index table1_valueb on table1 (dummyvalue);

Perform a range query on ValueA with DummyValue = 0:

cqlsh:demo> select * from table1 where dummyvalue = 0 and valuea > 5 allow filtering;

 keya | keyb | dummyvalue | valuea
------+------+------------+--------
    4 |    5 |          0 |      6
    7 |    8 |          0 |      9
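
The same query can be traced by hand with a small Python sketch. This is a toy model under the assumption that the index behaves like a dictionary from value to row positions; it says nothing about Cassandra's real internals. It reproduces the result above and also shows the cost of the trick: because every row shares dummyvalue = 0, the EQ term selects the whole table, so the range check still runs over every row.

```python
# Rows from the example table: (keya, keyb, dummyvalue, valuea).
ROWS = [
    (1, 2, 0, 3),
    (4, 5, 0, 6),
    (7, 8, 0, 9),
]

# Toy secondary index on dummyvalue: value -> list of row positions.
dummy_index = {}
for pos, row in enumerate(ROWS):
    dummy_index.setdefault(row[2], []).append(pos)

# Models: WHERE dummyvalue = 0 AND valuea > 5 ALLOW FILTERING
candidates = [ROWS[pos] for pos in dummy_index.get(0, [])]
result = [row for row in candidates if row[3] > 5]

print(result)           # [(4, 5, 0, 6), (7, 8, 0, 9)]
print(len(candidates))  # 3: every row in the table was a candidate
```

So the dummy column satisfies the EQ requirement syntactically, but it does not reduce the number of rows that have to be examined.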
