Mongo for non-regular time series
Question
I'm using MongoDB to handle time series. This works fine for now because there isn't much data yet, but I need to work out what it will take to scale to a much larger volume. Today more than 200k data points are received per day, with a data point arriving every couple of seconds. That is not huge, but it should increase soon.
The collection I use is far from efficient, since each piece of data (parentID, timestamp, value) creates its own document. I've seen several approaches that use one document to hold the time series for a whole hour (with, for instance, an inner array that holds the data for each second). This is really great, but since the data I have to handle is not received at regular intervals (it depends on the parentID), this approach might not be appropriate.
Among the data I receive:
- some is received every couple of seconds
- some is received every couple of minutes
For all of this data, the step between two consecutive readings is not necessarily the same.
Is there a better approach I could use to handle this data, for instance a different data model, that would help the DB scale?
Today only one mongod process is running, and I'm wondering at what point sharding would really be needed. Any tips on this?
Recommended answer
You may still be able to reap the benefit of preallocated documents even if the readings aren't uniformly distributed. You can't structure each document by the time of the readings, but you can structure each document to hold a fixed number of readings:
{
"type" : "cookies consumed",
"0" : { "number" : 1, "timestamp" : ISODate("2015-02-09T19:00:20.309Z") },
"1" : { "number" : 4, "timestamp" : ISODate("2015-02-09T19:03:25.874Z") },
...
"1000" : { "number" : 0, "timestamp" : ISODate("2015-01-01T00:00:00Z") }
}
Depending on your use case, this structure might work for you. It gives you the benefit of updating preallocated documents in place with new readings, and you only allocate a brand new document every N readings, for some big N.
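To make the fixed-size bucket idea concrete, here is a minimal Python sketch that only builds the documents involved, so it runs without a database. The `count` and `seq` fields, the `size` parameter, and the helper names `new_bucket` and `next_write` are assumptions added for illustration; they are not part of the structure shown above. The placeholder timestamp mirrors the one used for the unused slot in the example document.

```python
from datetime import datetime, timezone

# Sentinel timestamp for slots that have not been written yet (assumption).
PLACEHOLDER_TS = datetime(2015, 1, 1, tzinfo=timezone.utc)


def new_bucket(series_type, seq, size=1000):
    """Build a preallocated bucket with `size` placeholder slots.

    `seq` (bucket sequence number) and `count` (slots used so far) are
    hypothetical bookkeeping fields, not part of the original structure.
    """
    doc = {"type": series_type, "seq": seq, "count": 0}
    for i in range(size):
        doc[str(i)] = {"number": 0, "timestamp": PLACEHOLDER_TS}
    return doc


def next_write(bucket, number, timestamp):
    """Return (filter, update) that fills the next free slot, or None if full."""
    size = sum(1 for key in bucket if key.isdigit())
    slot = bucket["count"]
    if slot >= size:
        return None  # caller should insert a fresh bucket with seq + 1
    # Matching on the current count means a concurrent writer that already
    # took this slot makes the update match nothing instead of overwriting.
    flt = {"type": bucket["type"], "seq": bucket["seq"], "count": slot}
    upd = {
        "$set": {str(slot): {"number": number, "timestamp": timestamp}},
        "$inc": {"count": 1},
    }
    return flt, upd
```

With PyMongo, the bucket would typically be passed to `insert_one` and each `(filter, update)` pair to `update_one`. Because every slot already exists with values of the same type, each update overwrites fields in place rather than growing the document, and a new document is only inserted once per `size` readings.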