Cassandra:使用包含大于或小于 (< 和 >) 的 where 子句进行查询 [英] Cassandra: Query with where clause containing greather- or lesser-than (< and >)

查看:23
本文介绍了Cassandra:使用包含大于或小于 (< 和 >) 的 where 子句进行查询的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用 Cassandra 1.1.2 我正在尝试将 RDBMS 应用程序转换为 Cassandra.在我的 RDBMS 应用程序中,我有以下名为 table1 的表:

I'm using Cassandra 1.1.2 I'm trying to convert a RDBMS application to Cassandra. In my RDBMS application I have following table called table1:

| Col1 | Col2 | Col3 | Col4 |

  1. Col1:字符串(主键)
  2. Col2:字符串(主键)
  3. Col3:Bigint(索引)
  4. Col4:Bigint

该表包含超过 2 亿条记录.最常用的查询是这样的:

This table counts over 200 million records. Mostly used query is something like:

Select * from table where col3 < 100 and col3 > 50;

在 Cassandra 中,我使用以下语句来创建表:

In Cassandra I used following statement to create the table:

create table table1 (primary_key varchar, col1 varchar, 
col2 varchar, col3 bigint, col4 bigint, primary key (primary_key));

create index on table1(col3);

我将主键更改为一个额外的列(我计算了应用程序中的键).导入几条记录后,我尝试执行以下 cql:

I changed the primary key to an extra column (I calculate the key inside my application). After importing a few records I tried to execute following cql:

select * from table1 where col3 < 100 and col3 > 50;

这个结果是:

Bad Request: No indexed columns present in by-columns clause with Equal operator

查询select col1,col2,col3,col4 from table1 where col3 = 67 工作

Google 表示无法执行此类查询.是对的吗?有什么建议可以创建这样的查询吗?

Google said there is no way to execute that kind of queries. Is that right? Any advice how to create such a query?

推荐答案

Cassandra 索引实际上并不支持顺序访问;参见 http://www.datastax.com/docs/1.1/ddl/indexes 快速解释它们的用处.但不要绝望;使用 Cassandra(以及许多其他 NoSQL 系统)的更经典的方法是反规范化、反规范化、反规范化.

Cassandra indexes don't actually support sequential access; see http://www.datastax.com/docs/1.1/ddl/indexes for a good quick explanation of where they are useful. But don't despair; the more classical way of using Cassandra (and many other NoSQL systems) is to denormalize, denormalize, denormalize.

在您的情况下,使用经典的bucket-range 模式是一个好主意,它允许您使用推荐的 RandomPartitioner 并保持您的行在您的集群中良好分布,同时仍然允许顺序访问你的价值观.在这种情况下的想法是,您将第二个动态列族映射(分桶和有序)col3 值返回到相关的 primary_key 值.例如,如果您的 col3 值范围从 0 到 10^9 并且分布相当均匀,您可能希望将它们放入 1000 个范围为 10^6 的桶中(最好的粒度级别将取决于您需要的查询类型、您拥有的数据类型、查询往返时间等).cql3 的示例架构:

It may be a good idea in your case to use the classic bucket-range pattern, which lets you use the recommended RandomPartitioner and keep your rows well distributed around your cluster, while still allowing sequential access to your values. The idea in this case is that you would make a second dynamic columnfamily mapping (bucketed and ordered) col3 values back to the related primary_key values. As an example, if your col3 values range from 0 to 10^9 and are fairly evenly distributed, you might want to put them in 1000 buckets of range 10^6 each (the best level of granularity will depend on the sort of queries you need, the sort of data you have, query round-trip time, etc). Example schema for cql3:

CREATE TABLE indexotron (
    rangestart int,
    col3val int,
    table1key varchar,
    PRIMARY KEY (rangestart, col3val, table1key)
);

插入table1时,应在indexotron中插入对应的行,rangestart = int(col3val/1000000).那么当你需要用col3>X枚举table1中的所有行时,你需要查询最多1000个indexotron的bucket,但是所有的col3vals 内将被排序.查找所有 table1.primary_key 值的示例查询,其中 table1.col3 <4021:

When inserting into table1, you should insert a corresponding row in indexotron, with rangestart = int(col3val / 1000000). Then when you need to enumerate all rows in table1 with col3 > X, you need to query up to 1000 buckets of indexotron, but all the col3vals within will be sorted. Example query to find all table1.primary_key values for which table1.col3 < 4021:

SELECT * FROM indexotron WHERE rangestart = 0 ORDER BY col3val;
SELECT * FROM indexotron WHERE rangestart = 1000 ORDER BY col3val;
SELECT * FROM indexotron WHERE rangestart = 2000 ORDER BY col3val;
SELECT * FROM indexotron WHERE rangestart = 3000 ORDER BY col3val;
SELECT * FROM indexotron WHERE rangestart = 4000 AND col3val < 4021 ORDER BY col3val;

这篇关于Cassandra:使用包含大于或小于 (&lt; 和 &gt;) 的 where 子句进行查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆