Cassandra timeseries datamodel

Views: 275 | Published: 2016/11/13 14:49:30 | Tags: nosql cassandra datamodel phpcassa cassandra-jdbc


Problem description

Assume there are 10 devices (dev01, dev02, dev03, etc.).

They send data at some interval, and we collect it. Our data schema is:

  dev01      : int
  signalname : string
  signaltime : date/time [YY-MM-DD HHMMSS.mm]
  Extradata  : String

I want to push this data into Cassandra. What is the best way to store it?

My queries are:

1 Retrieve the current day's data for one device, or its data for some date range.

2 Retrieve the current day's data for 5 devices.

I am not sure whether the following way of storing the data in Cassandra is the best model:

  Standard column family
  Name: signalname
    row key: dev01
      columnname:  timeseries(20120801124204) [YYMMDD HHMMSS]
      columnvalue: JSON data
      columnname:  timeseries(20120801124205) [YYMMDD HHMMSS] [next second's data]
      columnvalue: JSON data
    row key: dev02
      columnname:  timeseries(20120801124204) [YYMMDD HHMMSS]
      columnvalue: JSON data
      columnname:  timeseries(20120801124205) [YYMMDD HHMMSS] [next second's data]
      columnvalue: JSON data

Or

  Super column family: signalname
    row key: Clientid1
      supercolumnname: dev01
        columnname:  timeseries(20120801124204) [YYMMDD HHMMSS]
        columnvalue: JSON data
      supercolumnname: dev02
        columnname:  timeseries(20120801124204) [YYMMDD HHMMSS]
        columnvalue: JSON data
    row key: Clientid2
      supercolumnname: dev03
        columnname:  timeseries(20120801124204) [YYMMDD HHMMSS]
        columnvalue: JSON data
      supercolumnname: dev04
        columnname:  timeseries(20120801124204) [YYMMDD HHMMSS]
        columnvalue: JSON data
Kindly help me out with this issue. Is there any other way?

Thanks & regards, Kannadhasan
Solution
I see 3 issues with your approach, which I will address below:

super column families,

Thrift vs. CQL3,

JSON data as cell values.

Before you go ahead: the use of super column families is discouraged. Read more here. Composite keys (as described below) are the way to go.

Also, you might need to read up on CQL3, since Thrift has been a legacy API since Cassandra 1.2.

Instead of storing JSON data, you can make use of native collection data types such as lists and maps. If you still want to work with JSON, Cassandra has had improved JSON support since version 2.2.
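As a rough illustration of the collection option, here is a minimal sketch that flattens a JSON payload into string key/value pairs suitable for a `map<text, text>` column. The table name `signals`, the column name `extra` and the helper `to_text_map` are hypothetical and not part of the question; the CQL is kept as a plain string, so no driver is required to run the snippet.

```python
import json
from typing import Dict

# Hypothetical table: the Extradata field is held in a native map column
# instead of an opaque JSON string.
SIGNALS_CQL = """
CREATE TABLE IF NOT EXISTS signals (
    device_id   text,
    event_time  timeuuid,
    signal_name text,
    extra       map<text, text>,
    PRIMARY KEY (device_id, event_time)
);
"""


def to_text_map(payload: str) -> Dict[str, str]:
    """Flatten a JSON object into string keys and string values."""
    obj = json.loads(payload)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    # Strings are kept as they are; other values are re-encoded as JSON text.
    return {
        str(key): value if isinstance(value, str) else json.dumps(value)
        for key, value in obj.items()
    }


if __name__ == "__main__":
    print(to_text_map('{"temp": 21.5, "unit": "C"}'))
```

Nested objects end up as JSON text inside the map values, so this only pays off when the payload is mostly flat.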

In general, it is pretty straightforward to query per device and per time period:

Your row key would be the device id and the column key a timeuuid.

To avoid hot spots, you could add "bucket" counters to the row key (creating a composite row/partition key) so that writes rotate across nodes.

You can then query for time ranges if you know the row/device id.
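The layout described above can be sketched as follows. This assumes the "bucket" is a calendar day encoded as `YYYYMMDD`, which is only one possible choice; the table name `signals_by_device`, its column names and the helper functions are hypothetical. The CQL statements are plain strings, and the Python code only works out which partitions a query would have to read.

```python
from datetime import date, datetime, timedelta
from typing import List, Tuple

# Hypothetical CQL3 table: composite partition key (device_id, day),
# clustered by a timeuuid so rows within a partition are ordered by time.
SCHEMA_CQL = """
CREATE TABLE IF NOT EXISTS signals_by_device (
    device_id   text,
    day         text,
    event_time  timeuuid,
    signal_name text,
    extra_data  text,
    PRIMARY KEY ((device_id, day), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
"""

# Time-range query inside a single (device_id, day) partition.
RANGE_QUERY_CQL = """
SELECT event_time, signal_name, extra_data
FROM signals_by_device
WHERE device_id = ? AND day = ?
  AND event_time >= minTimeuuid(?) AND event_time < minTimeuuid(?);
"""


def day_bucket(ts: datetime) -> str:
    """Return the day bucket (YYYYMMDD) a reading belongs to."""
    return ts.strftime("%Y%m%d")


def buckets_between(start: date, end: date) -> List[str]:
    """List every day bucket from start to end, both inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days + 1)]


def partitions_to_read(device_ids: List[str], start: date, end: date) -> List[Tuple[str, str]]:
    """Enumerate the (device_id, day) partitions a date-range query must visit."""
    return [(dev, bucket) for dev in device_ids for bucket in buckets_between(start, end)]


if __name__ == "__main__":
    today = date(2012, 8, 1)
    # Query 2 from the question: current-day data for several devices.
    print(partitions_to_read(["dev01", "dev02"], today, today))
```

With this layout, the current-day query for 5 devices becomes five single-partition reads, one per `(device_id, day)` pair, and a date range for one device becomes one read per day in the range.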

Alternatively, if you want to query data for multiple devices (but one event type) at once, you could use your signal type as the row key (and a timeuuid/timestamp as the column key). Read more on time series data in Cassandra in this blog entry.
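A minimal sketch of this alternative, assuming the partition key is the signal name plus a day bucket (the day bucket is an added assumption to keep partitions bounded, not something the alternative requires). The table name `signals_by_type` and the helper `group_by_partition` are hypothetical. The helper shows that readings from different devices with the same signal type land in the same partition, so one read covers all of them.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# Hypothetical CQL3 table keyed by signal type; device_id is a regular column.
SIGNALS_BY_TYPE_CQL = """
CREATE TABLE IF NOT EXISTS signals_by_type (
    signal_name text,
    day         text,
    event_time  timeuuid,
    device_id   text,
    extra_data  text,
    PRIMARY KEY ((signal_name, day), event_time)
);
"""

Reading = Tuple[str, str, datetime]  # (device_id, signal_name, signal_time)


def group_by_partition(readings: Iterable[Reading]) -> Dict[Tuple[str, str], List[str]]:
    """Group device ids by the (signal_name, day) partition their reading falls in."""
    partitions: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for device_id, signal_name, ts in readings:
        partitions[(signal_name, ts.strftime("%Y%m%d"))].append(device_id)
    return dict(partitions)


if __name__ == "__main__":
    sample = [
        ("dev01", "temperature", datetime(2012, 8, 1, 12, 42, 4)),
        ("dev02", "temperature", datetime(2012, 8, 1, 12, 42, 5)),
    ]
    print(group_by_partition(sample))
```

The trade-off is that a very common signal type concentrates writes from all devices on one partition per day, so the bucket may need to be finer than a day.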

Hope that helps!
