时间比较的奇怪行为 [英] Strange behavior of timeuuid comparison

查看:218
本文介绍了时间比较的奇怪行为的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有Cassandra 2.x集群有3个节点和db方案像这样:

I have Cassandra 2.x cluster with 3 nodes and the db scheme like this:

cqlsh> CREATE KEYSPACE test_ks WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3} AND durable_writes = true;
cqlsh> CREATE TABLE IF NOT EXISTS test_ks.test_cf (
   ...   time timeuuid,
   ...   user_id varchar,
   ...   info varchar,
   ...   PRIMARY KEY (time, user_id)
   ... ) WITH compression = {'sstable_compression': 'LZ4Compressor'} AND compaction = {'class': 'LeveledCompactionStrategy'};

允许添加一些数据(等待一段时间插入):

Lets add some data (wait some time betweeb inserts):

cqlsh> INSERT INTO test_ks.test_cf (time, user_id, info) VALUES (now(), 'user1', 'pythonista');
cqlsh> INSERT INTO test_ks.test_cf (time, user_id, info) VALUES (now(), 'user1', 'mr. Haskell');

让我们看看我们的数据:

Lets look at our data:

cqlsh> SELECT dateOf(time), user_id, info FROM test_ks.test_cf;

 dateOf(time)             | user_id | info
--------------------------+---------+-------------
 2014-06-24 16:00:31+0700 |   user1 | mr. Haskell
 2014-06-24 15:59:32+0700 |   user1 |  pythonista

(2 rows)



我在查询test_cf时得到一些奇怪的结果CF:

I get some strange results while querying test_cf CF:

cqlsh> SELECT dateOf(time) FROM test_ks.test_cf WHERE user_id='user1' AND token(time) >= token(maxTimeuuid('2014-06-24 16:00:31+0700')) ALLOW FILTERING;

 dateOf(time)
--------------------------
 2014-06-24 15:59:32+0700

(1 rows)

cqlsh> SELECT dateOf(time) FROM test_ks.test_cf WHERE user_id='user1' AND token(time) >= token(maxTimeuuid('2014-06-24 16:00:32+0700')) ALLOW FILTERING;

 dateOf(time)
--------------------------
 2014-06-24 15:59:32+0700

(1 rows)

cqlsh> SELECT dateOf(time) FROM test_ks.test_cf WHERE user_id='user1' AND token(time) >= token(maxTimeuuid('2014-06-24 16:00:33+0700')) ALLOW FILTERING;

 dateOf(time)
--------------------------
 2014-06-24 16:00:31+0700
 2014-06-24 15:59:32+0700

(2 rows)

正如你可以看到比较给出错误的结果,虽然 timeuuid 不能大于 maxTimeuuid ,并且必须大于'minTimeuuid'(对于同一个datetime,当然=)。有人可以向我解释这个奇怪的行为吗?

As you can see comparison gives the wrong results, although, timeuuid must not be greater than maxTimeuuid and must be greater than 'minTimeuuid' (for a same datetime, of course =) ). Could somebody explain to me this strange behavior?

TIA!

推荐答案

首先,你的主键是向后的 - 如果你想执行这样的查询,你的 CREATE TABLE 语句应该是这样:

First off, your primary key is backwards—if you want to perform queries like this, your CREATE TABLE statement should look like this:

CREATE TABLE IF NOT EXISTS test_ks.test_cf (
  time timeuuid,
  user_id varchar,
  info varchar,
  PRIMARY KEY (user_id, time)
) WITH compression = {'sstable_compression': 'LZ4Compressor'}
  AND compaction = {'class': 'LeveledCompactionStrategy'};

这使得 user_id code> time 适合您的查询模式的聚类列。

That makes user_id the partition key and time the clustering column, which fits your query pattern.

使用此更改,您不再需要使用 TOKEN 函数,这对于对分区键进行范围查询是必需的,但对于聚类列(并且通常不对应于列值的语义排序)是无意义的。由于时间现在是一个聚类列,您只需要进行一个正常的比较:

With this change, you no longer need to use the TOKEN function, which is needed to make range queries over partition keys, but is meaningless for clustering columns (and generally does not correspond to semantic ordering of column values). Since time is now a clustering column, you just need to do a normal comparison:

SELECT dateOf(time)
FROM test_ks.test_cf
WHERE user_id='user1'
  AND time >= maxTimeuuid('2014-06-24 16:00:31+0700');

您也可以删除 ALLOW FILTERING 因为您现在正在执行标准范围切片,而不是标记比较,这需要扫描所有行并进行比较。

You can also drop the ALLOW FILTERING, since you're now doing a standard range slice rather than the token comparison, which required scanning all the rows and making the comparison.

这篇关于时间比较的奇怪行为的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆