Cassandra: sorting problem, ordering is wrong
Problem description
I have a question about Cassandra. At present, sorting "entities_by_time" by column1 works while the uuid values have 18 digits, but the ordering goes wrong once the uuid values grow to 19 digits. Please help me.
cqlsh:minds> select * from entities_by_time where key='activity:user:990192934408163330' order by column1 desc limit 10;
key | column1 | value
----------------------------------+--------------------+--------------------
activity:user:990192934408163330 | 999979571363188746 | 999979571363188746
activity:user:990192934408163330 | 999979567064027139 | 999979567064027139
activity:user:990192934408163330 | 999979562764865555 | 999979562764865555
activity:user:990192934408163330 | 999979558465703953 | 999979558465703953
activity:user:990192934408163330 | 999979554170736649 | 999979554170736649
activity:user:990192934408163330 | 999979549871575047 | 999979549871575047
activity:user:990192934408163330 | 999979545576607752 | 999979545576607752
activity:user:990192934408163330 | 999979541290029073 | 999979541290029073
activity:user:990192934408163330 | 999979536990867461 | 999979536990867461
activity:user:990192934408163330 | 999979532700094475 | 999979532700094475
cqlsh:minds> select * from entities_by_time where key='activity:user:990192934408163330' order by column1 asc limit 10;
key | column1 | value
----------------------------------+---------------------+---------------------
activity:user:990192934408163330 | 1000054880351555598 | 1000054880351555598
activity:user:990192934408163330 | 1000054884671688706 | 1000054884671688706
activity:user:990192934408163330 | 1000054888966656017 | 1000054888966656017
activity:user:990192934408163330 | 1000054893257429005 | 1000054893257429005
activity:user:990192934408163330 | 1000054897552396308 | 1000054897552396308
activity:user:990192934408163330 | 1000054901843169290 | 1000054901843169290
activity:user:990192934408163330 | 1000054906138136577 | 1000054906138136577
activity:user:990192934408163330 | 1000054910433103883 | 1000054910433103883
activity:user:990192934408163330 | 1000054914723876869 | 1000054914723876869
activity:user:990192934408163330 | 1000054919010455568 | 1000054919010455568
CREATE TABLE minds.entities_by_time (
key text,
column1 text,
value text,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'enabled': 'false'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.1
AND speculative_retry = '99PERCENTILE';
Querying shows that, in Cassandra, 1007227353832624141 is less than 963426376394739730. Why?
Recommended answer
Good call, Chris! The table definition tells it all. I recreated your table and ran queries sorting in both directions:
flynn@cqlsh:stackoverflow> SELECT * FROM entities_by_time
WHERE key='activity:user:990192934408163330' ORDER BY column1 DESC;
key | column1 | value
----------------------------------+---------------------+---------------------
activity:user:990192934408163330 | 999979571363188746 | 999979571363188746
activity:user:990192934408163330 | 999979567064027139 | 999979567064027139
activity:user:990192934408163330 | 963426376394739730 | 963426376394739730
activity:user:990192934408163330 | 1007227353832624141 | 1007227353832624141
activity:user:990192934408163330 | 1000054884671688706 | 1000054884671688706
activity:user:990192934408163330 | 1000054880351555598 | 1000054880351555598
(6 rows)
flynn@cqlsh:stackoverflow> SELECT * FROM entities_by_time
WHERE key='activity:user:990192934408163330' ORDER BY column1 ASC;
key | column1 | value
----------------------------------+---------------------+---------------------
activity:user:990192934408163330 | 1000054880351555598 | 1000054880351555598
activity:user:990192934408163330 | 1000054884671688706 | 1000054884671688706
activity:user:990192934408163330 | 1007227353832624141 | 1007227353832624141
activity:user:990192934408163330 | 963426376394739730 | 963426376394739730
activity:user:990192934408163330 | 999979567064027139 | 999979567064027139
activity:user:990192934408163330 | 999979571363188746 | 999979571363188746
(6 rows)
To your question:
"in Cassandra, 1007227353832624141 is less than 963426376394739730. Why?"
Quite simply, because 9 > 1.
Your table definition clusters on column1, which is a TEXT/UTF8 string and not a numeric. Essentially, Cassandra is sorting strings the only way it knows how: in ASCII-betical order, which is not numeric order. Strings are compared character by character from the left, so any value starting with "1" sorts before any value starting with "9", no matter how many digits follow.
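The effect can be reproduced outside Cassandra. The sketch below is only an illustration in Python, not Cassandra code: it takes the six column1 values from the query output above and sorts them once as strings and once as numbers. For digit-only ASCII strings, Python's string comparison orders values the same way a byte-wise text comparison does, so the string sort should match the ASC result shown earlier.

```python
# column1 values copied from the query output in this answer.
ids = [
    "999979571363188746",
    "999979567064027139",
    "963426376394739730",
    "1007227353832624141",
    "1000054884671688706",
    "1000054880351555598",
]

# String sort: compares character by character from the left,
# so every value starting with "1" comes before every value starting with "9".
as_text = sorted(ids)

# Numeric sort: compares by magnitude, so the 18-digit values come first.
as_numbers = sorted(ids, key=int)

print("text order:   ", as_text)
print("numeric order:", as_numbers)
```

The text order puts the 19-digit values first, which is the ordering the question observed.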
Store your numerics as numerics, and sorting will behave in ways that are more predictable.
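As a rough check before changing a column type, one could verify that the existing text values actually fit a numeric type. The sketch below assumes a 64-bit signed integer column (the range of CQL's bigint); `to_bigint` is a hypothetical helper written for this illustration, not part of any driver, and whether every id in the real table fits that range is an assumption that would need checking against the actual data.

```python
# Range of a 64-bit signed integer, which is what CQL's bigint holds.
BIGINT_MIN = -(2**63)
BIGINT_MAX = 2**63 - 1


def to_bigint(text_id: str) -> int:
    """Hypothetical helper: parse a text id and confirm it fits in 64 bits."""
    value = int(text_id)
    if not BIGINT_MIN <= value <= BIGINT_MAX:
        raise ValueError(f"{text_id} does not fit in a 64-bit signed integer")
    return value


# The two values from the question.
larger = to_bigint("1007227353832624141")
smaller = to_bigint("963426376394739730")

# Compared as numbers, the 19-digit value is the larger one.
print(larger > smaller)
```

Both values from the question pass the range check, so once they are compared as numbers the 19-digit id sorts after the 18-digit one.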