Mongo for non-regular time series

Views: 86 | Published: 2020/5/11 1:53:22 | Tags: mongodb, time-series, scalability

Question

I'm using MongoDB to handle time series. This works fine for now because there isn't much data yet, but I need to identify what is required to scale to a larger volume. Today, more than 200k data points are received per day, each arriving every couple of seconds. That is not huge, but it should increase soon.

The collection I use is far from efficient, as each piece of data (parentID, timestamp, value) creates its own document. I've seen several approaches that use one document to hold the time series for a whole hour (with, for instance, an inner array that holds the data for each second). This is really great, but since the data I have to handle is not received at regular intervals (it depends on the parentID), this approach might not be appropriate.

Among the data I receive:
- some are received every couple of seconds
- some are received every couple of minutes
For all of this data, the step between two consecutive readings is not necessarily the same.

Is there a better approach I could use to handle this data, for instance a different data model, that would help the DB scale?

Today only one mongod process is running, and I'm wondering at what point sharding would really be needed. Any tips on this?

Recommended answer

You may still be able to reap the benefit of having a preallocated document even if readings aren't uniformly distributed. You can't structure each document by the time of the readings, but you can structure each document to hold a fixed number of readings:

{
    "type" : "cookies consumed",
    "0" : { "number" : 1, "timestamp" : ISODate("2015-02-09T19:00:20.309Z") },
    "1" : { "number" : 4, "timestamp" : ISODate("2015-02-09T19:03:25.874Z") },
    ...
    "1000" : { "number" : 0, "timestamp" : ISODate("2015-01-01T00:00:00Z") }
}

Depending on your use case, this structure might work for you. It gives you the benefit of updating preallocated documents in place as new readings arrive, and a brand new document only has to be allocated once every N readings, for some large N.
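As a rough illustration of the fixed-size bucket idea above, here is a minimal JavaScript sketch. Everything named in it is an assumption made for the example and not part of the original answer: the helper names `preallocatedBucket` and `slotUpdate`, the `bucket` and `count` fields, the bucket size of 1000, and the per-parentID running counter `seq` that the caller would have to maintain. It uses the question's `parentID` and `value` fields in place of the answer's `type` and `number`. The helpers only build plain objects, so they run without a database.

```javascript
// Hypothetical bucket size: how many readings one document holds (an assumption).
const BUCKET_SIZE = 1000;

// Build a preallocated bucket document with `size` placeholder slots,
// keyed "0", "1", ... like the document in the answer.
function preallocatedBucket(parentID, bucketIndex, size = BUCKET_SIZE) {
  const placeholder = new Date(0); // dummy timestamp, overwritten later
  const doc = { parentID: parentID, bucket: bucketIndex, count: 0 };
  for (let i = 0; i < size; i++) {
    doc[String(i)] = { value: 0, timestamp: placeholder };
  }
  return doc;
}

// Build the filter and update documents that write reading number `seq`
// (a running counter per parentID, assumed to be tracked by the caller)
// into its slot. Slots are chosen by arrival order, not by time, so
// irregular spacing between readings does not matter.
function slotUpdate(parentID, seq, value, timestamp, size = BUCKET_SIZE) {
  const bucket = Math.floor(seq / size); // which document
  const slot = seq % size;               // which slot inside it
  return {
    filter: { parentID: parentID, bucket: bucket },
    update: {
      $set: {
        [`${slot}.value`]: value,
        [`${slot}.timestamp`]: timestamp,
      },
      $inc: { count: 1 },
    },
  };
}
```

With a hypothetical collection named `readings`, a bucket could be created ahead of time with `db.readings.insertOne(preallocatedBucket("sensor-1", 0))`, and each reading written with `const u = slotUpdate("sensor-1", seq, value, new Date()); db.readings.updateOne(u.filter, u.update)`. Whether preallocation actually pays off depends on the storage engine and workload, so it is worth measuring before committing to this layout.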
