ORDER BY with 2ndary indexes is not supported

Problem description

I am using Cassandra 2.1 with the latest CQL.

Here are my table and indexes:

CREATE TABLE mydata.chats_new (
    id bigint,
    adid bigint,
    fromdemail text,
    fromemail text,
    fromjid text,
    messagebody text,
    messagedatetime text,
    messageid text,
    messagetype text,
    todemail text,
    toemail text,
    tojid text,
    PRIMARY KEY(messageid,messagedatetime)
);



CREATE INDEX user_fromJid ON mydata.chats_new (fromjid);
CREATE INDEX user_toJid ON mydata.chats_new (tojid);
CREATE INDEX user_adid ON mydata.chats_new (adid);

When I execute this query:

select * from chats_new WHERE fromjid='test' AND toJid='test1' ORDER BY messagedatetime DESC;

I get this error:

code=2200 [Invalid query] message="ORDER BY with 2ndary indexes is not supported."

So how should I fetch this data?

Solution

select * from chats_new 
WHERE fromjid='test' AND toJid='test1' 
ORDER BY messagedatetime DESC;

code=2200 [Invalid query] message="ORDER BY with 2ndary indexes is not supported."

To get the WHERE clause of this query to work, I would build a specific query table, like this:

CREATE TABLE mydata.chats_new_by_fromjid_and_tojid (
    id bigint,
    adid bigint,
    fromdemail text,
    fromemail text,
    fromjid text,
    messagebody text,
    messagedatetime text,
    messageid text,
    messagetype text,
    todemail text,
    toemail text,
    tojid text,
    PRIMARY KEY((fromjid, tojid), messagedatetime, messageid)
);

Note the primary key definition. This creates a composite partition key out of fromjid and tojid. While this allows you to query on both fields, it also requires both fields to be specified in every query on this table. But that's why it is called a "query table": it is generally designed to serve one particular query.
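To illustrate the composite partition key constraint, here is a rough sketch of one query that the table above should accept and one that it should not. The literal values are just the placeholders from the question.

```cql
-- Accepted: both partition key columns (fromjid, tojid) are restricted
-- with equality, so Cassandra can locate a single partition.
SELECT * FROM mydata.chats_new_by_fromjid_and_tojid
WHERE fromjid = 'test' AND tojid = 'test1';

-- Rejected as written: tojid is missing, so the partition key is only
-- partially specified and the partition cannot be located.
SELECT * FROM mydata.chats_new_by_fromjid_and_tojid
WHERE fromjid = 'test';
```

If you do need to look up messages by fromjid alone, that is a different query and would call for its own query table.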

As for the remaining fields in the primary key, I kept messagedatetime as the first clustering column to ensure on-disk sort order. The default clustering order in Cassandra is ascending, so if you want to reverse it at query time, that's where your ORDER BY messagedatetime DESC comes into play. Lastly, I made messageid the second clustering column to help ensure primary key uniqueness (assuming that messageid is unique).
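If newest-first is the order you will almost always want, one option is to store the rows in descending order so that no ORDER BY is needed at query time. The following is only a sketch: the table name chats_new_by_fromjid_and_tojid_desc is made up for illustration, and the LIMIT value is arbitrary.

```cql
-- Hypothetical variant of the query table with a descending on-disk order.
CREATE TABLE mydata.chats_new_by_fromjid_and_tojid_desc (
    id bigint,
    adid bigint,
    fromdemail text,
    fromemail text,
    fromjid text,
    messagebody text,
    messagedatetime text,
    messageid text,
    messagetype text,
    todemail text,
    toemail text,
    tojid text,
    PRIMARY KEY((fromjid, tojid), messagedatetime, messageid)
) WITH CLUSTERING ORDER BY (messagedatetime DESC, messageid ASC);

-- Rows come back newest-first without an ORDER BY clause.
SELECT * FROM mydata.chats_new_by_fromjid_and_tojid_desc
WHERE fromjid = 'test' AND tojid = 'test1'
LIMIT 20;
```

One caveat: messagedatetime is declared as text, so rows are sorted by string comparison. That matches chronological order only if the stored format sorts lexically (for example, a year-first format with zero-padded fields); otherwise a timestamp column may be a better fit.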

Now, this query will work:

select * from chats_new_by_fromjid_and_tojid 
WHERE fromjid='test' AND toJid='test1'
ORDER BY messagedatetime DESC;

If you need to query this data by additional criteria, I highly recommend that you create additional query table(s). Remember, Cassandra works best with tables that are specifically designed for each query they serve. It's OK to duplicate your data a few times, because disk space is cheap; operation time is not.
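Duplicating data means each message has to be written to every table that serves it. One way to keep the copies in step is a logged batch, sketched below. The message values are invented for illustration, and only a few of the columns are shown.

```cql
-- A logged batch: if any statement is applied, all of them eventually are.
BEGIN BATCH
    INSERT INTO mydata.chats_new
        (messageid, messagedatetime, fromjid, tojid, messagebody)
    VALUES ('msg-001', '2015-01-01 10:00:00', 'test', 'test1', 'hello');

    INSERT INTO mydata.chats_new_by_fromjid_and_tojid
        (fromjid, tojid, messagedatetime, messageid, messagebody)
    VALUES ('test', 'test1', '2015-01-01 10:00:00', 'msg-001', 'hello');
APPLY BATCH;
```

Logged batches carry some extra write overhead, so whether they are worth it depends on how much you care about the tables never drifting apart.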

Also, DataStax has a great article on when not to use a secondary index. It's definitely worth a read.
