Cassandra CQL select query not returning records which have timestamp as clusterkey


Problem description

Cassandra CQL: I have a table created with a composite partition key and a clustering key. When I run select * restricted to the partition key, I can retrieve all the data, and the relational operators (< or >) on the clustering key work too. But when I query for a particular clustering key using the equal-to (=) operator with the correct value, it returns 0 rows.

Table:

CREATE TABLE entity_data (
received_date timestamp,
entity text,
received_time timestamp,
node int,
primary key ((received_date ,entity),received_time));

Data (select * from entity):

 received_date            | entity | received_time            | node_id
 2014-09-24 00:00:00+0400 |     NA | 2014-09-24 18:56:55+0400 |       0

With a conditional query (this is where it does not work):

select * from entity_data 
where received_date = '2014-09-24 00:00:00+0400' and entity = 'NA' 
and received_time='2014-09-24 18:56:55+0400';
(0 rows)

-- it returns 0 rows.

Solution

I see what is going on. You are using now() to generate a time-UUID and then converting it to a timestamp with dateOf(). The converted value still carries its milliseconds; they are just not shown in the query output. Therefore querying for a received_time equal to 2014-09-24 18:56:55+0400 will yield nothing, because the stored timestamp includes a millisecond component that the literal in your WHERE clause does not.
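To make the mismatch concrete, here is a minimal Python sketch of the comparison involved. It does not talk to Cassandra; it only models a millisecond-precision timestamp and the second-precision string that is displayed. The `.123` millisecond part is a made-up value for illustration, since the real milliseconds of the stored row are not visible in the question.

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=4))

# Hypothetical stored value: the 123 ms are invented for this example.
stored = datetime(2014, 9, 24, 18, 56, 55, 123000, tzinfo=tz)

# What a second-precision display shows for that value.
displayed = stored.strftime("%Y-%m-%d %H:%M:%S%z")

# The literal used in the WHERE clause is parsed back without milliseconds.
queried = datetime.strptime(displayed, "%Y-%m-%d %H:%M:%S%z")

# An equality test compares the full millisecond value and fails.
equality_matches = stored == queried

# A one-second range around the displayed value does contain the stored value.
range_matches = queried <= stored < queried + timedelta(seconds=1)

print(displayed, equality_matches, range_matches)
```

This is also why the < and > operators appeared to work in the question: a range comparison tolerates the hidden milliseconds, while = requires an exact match.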

The best way to go about this is to store your times as timeuuids (NOTE: I left received_date as a timestamp just for purposes of the example). Then use dateOf() when you SELECT, and use the minTimeuuid() function in your WHERE clause:

CREATE TABLE entity_data2 (
    received_date timestamp,
    entity text,
    received_time timeuuid,
    node int,
PRIMARY KEY ((received_date, entity), received_time));

INSERT INTO entity_data2 (received_date, entity, received_time , node) 
VALUES ('2014-09-24 00:00:00+0400','NA',now(),0);

aploetz@cqlsh:stackoverflow> SELECT * FROM entity_data2 
    WHERE received_date = '2014-09-24 00:00:00+0400' AND entity = 'NA'  
    AND received_time>minTimeuuid('2014-10-08 08:13:53-0500') 
    AND received_time<minTimeuuid('2014-10-08 08:13:54-0500');

 received_date            | entity | received_time                        | node
--------------------------+--------+--------------------------------------+------
 2014-09-23 15:00:00-0500 |     NA | f3b548b0-4eec-11e4-9d05-7991a041665c |    0

(1 rows)

aploetz@cqlsh:stackoverflow> SELECT received_date, entity, dateof(received_time), node 
    FROM entity_data2 WHERE received_date = '2014-09-24 00:00:00+0400' AND entity = 'NA'
    AND received_time>minTimeuuid('2014-10-08 08:13:53-0500') 
    AND received_time<minTimeuuid('2014-10-08 08:13:54-0500');

 received_date            | entity | dateof(received_time)    | node
--------------------------+--------+--------------------------+------
 2014-09-23 15:00:00-0500 |     NA | 2014-10-08 08:13:53-0500 |    0

(1 rows)
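As a rough illustration of what dateOf() extracts, the sketch below decodes the embedded time from the timeuuid shown in the output above, using only Python's standard uuid module. It assumes the standard version 1 UUID layout (a 60-bit count of 100-nanosecond intervals since 1582-10-15); the constant and variable names are chosen for this example and are not part of any Cassandra API.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
UUID_EPOCH_OFFSET = 0x01B21DD213814000

# The received_time value returned by the first SELECT above.
time_uuid = uuid.UUID("f3b548b0-4eec-11e4-9d05-7991a041665c")

# UUID.time is the 60-bit timestamp; convert it to Unix epoch milliseconds.
epoch_millis = (time_uuid.time - UUID_EPOCH_OFFSET) // 10_000

# Render it in the -0500 offset used in the cqlsh session, keeping the milliseconds.
session_tz = timezone(timedelta(hours=-5))
decoded = datetime.fromtimestamp(epoch_millis / 1000, tz=session_tz)

print(epoch_millis)
print(decoded.isoformat(timespec="milliseconds"))
```

The decoded value falls inside the second 08:13:53-0500 but has a non-zero millisecond part, which is why the queries above bracket it with a one-second range instead of testing for equality.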

Basically, the dateOf() function was designed to be used for querying data, not storing it. Here is a blog post that describes (in more detail) how to make this work:

Time series based queries in Cassandra 1.2+ and CQL3
