Cassandra时间序列数据建模 [英] Cassandra Time-Series data modelling

查看:243
本文介绍了Cassandra时间序列数据建模的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

现在我已经在使用分析系统的数据模型了,但我似乎无法为我的主键获得正确的设置。我观看了一大堆视频( https:// www。 youtube.com/watch?v=UP74jC1kM3w&list=PLqcm6qE9lgKJoSWKYWHWhrVupRbS8mmDA&index=9 )了解有关最佳做法(特别是有关时间序列数据)的信息。

I've been working on a data model for an analytics system for a little while now, but I just can't seem to get the right setup for my primary keys. I've watched a bunch of videos (https://www.youtube.com/watch?v=UP74jC1kM3w&list=PLqcm6qE9lgKJoSWKYWHWhrVupRbS8mmDA&index=9) to gain some learnings about best practises, especially concerning time-series data.

关于PRIMARY KEYS,我似乎没有得到正确的平衡,所以我可以查询我需要的方式。

Regarding PRIMARY KEYS, I can't seem to get the right balance so that I can query the way I need to.

这里是我的数据模型:

Here is my data model so far:

CREATE TABLE eventPropertyCountsByDay (
    user_id int,
    event_type varchar,
    property varchar,
    value varchar,
    date_to_day varchar,
    count counter,
    PRIMARY KEY ((event_type, user_id), date_to_day, property, value)
) WITH CLUSTERING ORDER BY (date_to_day DESC, property DESC, value DESC);

我将事件存储在另一个表和此表中的事件属性(列族)中。

I am storing events in another table and event properties in this table (column family).

我需要能够根据用户id查询,用IN查询同时获取多个用户的记录,但我还需要查询属性和值字段,以及指定日期范围。

I need to be able to query according to the user id, with an IN query to get records for multiple users at once, but I also need to query on the property and value fields, as well as specifying a date range.

这是我想要实现的查询示例:

Here is an example of the query I'm trying to achieve:

SELECT * FROM eventPropertyCountsByWeek 
WHERE event_type = 'some_event' 
AND date_to_day > '2014-09-24'
AND user_id IN (123, 456) 
AND property = 'property_name' 
AND value = 'property_value'

如何完成这种查询?

推荐答案

请尝试以下操作:

CREATE TABLE eventPropertyCountsByDay (
  user_id int,
  event_type varchar,
  property varchar,
  value varchar,
  date_to_day int, // day number
  count counter,
  PRIMARY KEY ((event_type, user_id), property, value, date_to_day)
) WITH CLUSTERING ORDER BY (property DESC, value DESC, date_to_day DESC);

我在集群键结束时移动了date_to_day,使其可用于具有固定属性和值的范围查询。

I moved date_to_day at the end of clustering key to make it usable for range queries with fixed property and value.

数据更新查询:

update eventPropertyCountsByDay set count = count + 1 where 
  user_id=1 and 
  event_type='log' and 
  property='prop1' and 
  value='val1' and 
  date_to_day=54321;

选择查询:

select * from eventPropertyCountsByDay 
  where event_type='log' and 
    user_id=1 and 
    property='prop1' and
    value='val1' and 
    date_to_day > 54300;

 event_type | user_id | property | value | date_to_day | count
------------+---------+----------+-------+-------------+-------
        log |       1 |    prop1 |  val1 |       54323 |     2
        log |       1 |    prop1 |  val1 |       54321 |     1

这篇关于Cassandra时间序列数据建模的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆