How do I sort data by the last update date in Cassandra?


Question

I need advice on how to design a table correctly in Cassandra. I need to get a sorted list of all the books, ordered by the date of the last update. Each time a particular book is purchased, the number_of_buyers column is updated, and I also need to update the value of the updated_at column. The problem is that updated_at is the clustering key, which is part of the primary key, and Cassandra does not allow updating the values of columns that are part of the primary key.

create table books (
   book_id uuid,
   created_at timestamp,
   updated_at timestamp,
   book_name varchar,
   book_author varchar,
   number_of_buyers int,
   primary key (book_id, updated_at)
) with clustering order by (updated_at desc);
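
To make the limitation concrete, here is a minimal sketch against the books table above. The UUID, timestamps, and column values are hypothetical. The first statement is the kind of update Cassandra rejects, because updated_at appears in SET. The batch that follows is one possible workaround (delete the old row and insert a new one), but it requires knowing the previous updated_at value. Note also that with book_id as the partition key, rows are only ordered within a single book's partition, never across books.

```cql
-- Rejected: updated_at is a clustering column, so it cannot be assigned in SET.
UPDATE books
   SET number_of_buyers = 11, updated_at = toTimestamp(now())
 WHERE book_id = 5b6962dd-3f90-4c93-8f61-eabfa4a803e2;

-- Possible workaround (hypothetical values): replace the row instead of updating it.
-- Both statements target the same partition, so the batch is applied atomically.
BEGIN BATCH
  DELETE FROM books
   WHERE book_id = 5b6962dd-3f90-4c93-8f61-eabfa4a803e2
     AND updated_at = '2020-10-05 14:00:00+0000';
  INSERT INTO books (book_id, updated_at, created_at, book_name, book_author, number_of_buyers)
  VALUES (5b6962dd-3f90-4c93-8f61-eabfa4a803e2, '2020-10-05 14:29:33+0000',
          '2018-10-31 00:00:00+0000', 'Mastering Apache Cassandra 3.x',
          'Aaron Ploetz, Teja Malepati', 11);
APPLY BATCH;
```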

Another example:

create table chat_rooms (
   chat_room_id uuid,
   created_at timestamp,
   updated_at timestamp,
   last_message_content varchar,
   last_message_author varchar,
   unread_messages_number int,
   primary key (chat_room_id, updated_at)
) with clustering order by (updated_at desc);

Each chat room has a latest message, and this information changes constantly. Whenever it changes, I want to move the chat room to the top of the list. This is the classic behavior in many messengers.

Recommended answer

You are definitely going to need to partition on something different. The trick is finding the right balance between query flexibility (your obvious need here) and avoiding unbounded partition growth.

For the books table, is it possible to partition on something like category? For example: horror, fantasy, graphic novel, non-fiction, instructional, and so on.

CREATE TABLE book_events (
   book_id uuid,
   created_at timestamp,
   updated_at timestamp,
   book_name varchar,
   book_author varchar,
   number_of_buyers int,
   category text,
   PRIMARY KEY (category, book_name, updated_at, book_id)
) WITH CLUSTERING ORDER BY (book_name ASC,updated_at DESC,book_id ASC);

For the PRIMARY KEY definition, we can partition on category, then cluster on book_name and updated_at, with book_id at the end (for uniqueness). Then INSERT a new row for each sale event. In the query (after inserting a few rows), use the MAX aggregate on updated_at together with a GROUP BY clause on book_name.
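
As a minimal sketch of the "one row per sale event" step, the inserts below record two sales of the same book. The book_id, the created_at value, and the intermediate buyer count are made up for illustration. The sketch also assumes the application supplies the running number_of_buyers itself, since the table has no counter column.

```cql
-- Hypothetical sample data: the UUID, created_at, and buyer counts are illustrative only.
-- First sale event: a new row stamped with the current time.
INSERT INTO book_events (category, book_name, updated_at, book_id,
                         book_author, created_at, number_of_buyers)
VALUES ('Computers & Technology', 'Mastering Apache Cassandra 3.x', toTimestamp(now()),
        5b6962dd-3f90-4c93-8f61-eabfa4a803e2,
        'Aaron Ploetz, Teja Malepati', '2018-10-31 00:00:00+0000', 51);

-- Next sale event: another new row (not an UPDATE) with a later updated_at.
INSERT INTO book_events (category, book_name, updated_at, book_id,
                         book_author, created_at, number_of_buyers)
VALUES ('Computers & Technology', 'Mastering Apache Cassandra 3.x', toTimestamp(now()),
        5b6962dd-3f90-4c93-8f61-eabfa4a803e2,
        'Aaron Ploetz, Teja Malepati', '2018-10-31 00:00:00+0000', 52);
```

Because updated_at clusters in descending order within each book_name, the newest event is the first row of its group. In a GROUP BY query, Cassandra returns the first value encountered in each group for columns selected without an aggregate, so book_author and number_of_buyers come from the most recent sale row.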

SELECT book_name, book_author, number_of_buyers, MAX(updated_at) FROM book_events
 WHERE category='Computers & Technology' GROUP BY book_name;

 book_name                       | book_author                                                | number_of_buyers | system.max(updated_at)
---------------------------------+------------------------------------------------------------+------------------+---------------------------------
  Mastering Apache Cassandra 3.x |                                Aaron Ploetz, Teja Malepati |               52 | 2020-10-05 14:29:33.134000+0000
 Seven NoSQL Databases in a Week | Aaron Ploetz, Devram Kandhare, Brian Wu, Sudarshan Kadambi |              163 | 2020-10-05 14:29:33.142000+0000

(2 rows)

The only other consideration is what to do with the obsolete sale rows. You could delete them as you go, depending on the write frequency, of course. The best solution would be to consider the cadence of sales and apply a TTL.
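
A TTL can be sketched in two ways, either per write or as a table default. The 30-day value (2592000 seconds) is an arbitrary assumption for illustration, as are the UUID and column values. The right number depends on how often each book actually sells.

```cql
-- Assumption: 30 days (2592000 seconds) is an illustrative retention period, not a recommendation.
-- Per-write TTL: this sale row expires on its own 30 days after it is written.
INSERT INTO book_events (category, book_name, updated_at, book_id,
                         book_author, created_at, number_of_buyers)
VALUES ('Computers & Technology', 'Mastering Apache Cassandra 3.x', toTimestamp(now()),
        5b6962dd-3f90-4c93-8f61-eabfa4a803e2,
        'Aaron Ploetz, Teja Malepati', '2018-10-31 00:00:00+0000', 53)
USING TTL 2592000;

-- Table-wide default: applies to new writes that do not specify their own TTL.
ALTER TABLE book_events WITH default_time_to_live = 2592000;
```

One caveat with this design: if a book has no sale within the TTL window, all of its rows expire and it disappears from the query results. The TTL would need to exceed the longest expected gap between sales, or the book details would need to live in a separate table.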

This solution is definitely not complete as-is, but I hope it leads you in the right direction.
