MongoDB Collection Structure Performance


Problem description

I have a MongoDB database of semi-complex records, and my reporting queries are struggling as the collection grows. I want to build some reporting views that are optimized for quick searching and aggregating. Here is a sample format:

var record = {
    fieldOne: "",
    fieldTwo: "",
    fieldThree: "", // There are approximately 30 fields at this level
    ArrayOne: [
        {subItem1: ""},
        {subItem2: ""} // There are usually about 10-15 items in this array
    ],
    ArrayTwo: [
        {subItem1: ""}, // ArrayTwo items reference ArrayOne item ids
        {subItem2: ""}  // There are usually about 20-30 items in this array
    ],
    ArrayThree: [
        {subItem1: ""}, // ArrayThree items reference both ArrayOne and ArrayTwo item ids
        {subItem2: ""}, // There are usually about 200-300 items in this array
        {subArray: [
            {subItem1: ""},
            {subItem2: ""} // There are usually about 5 items in this array
        ]}
    ]
};

I used to nest this data, with ArrayTwo inside ArrayOne items and ArrayThree inside ArrayTwo items, so that the reference to a parent was implied, but reporting became a nightmare with multiple nested levels of arrays.

I have a field called 'fieldName' at every level, which is how we target objects in the arrays.

I will often need to aggregate values from any of the 3 arrays across thousands of records in a query.

I see two ways of doing it.

A) Flatten vertically, making a single smaller record in the database for every item in ArrayThree, which essentially adds 200 records per complex record. I tried this and already have 200K records from 5 days of incoming data. The benefit is that I have fieldNames that I can index.

B) Flatten horizontally, making every array flat within a single collection record. I would use the fieldName located in each array object as the key. This would produce a single record with 200-300 fields in it. There would be far fewer records in the collection, but the fields would be dynamic, so adding indexes would not be possible (that I know of).
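The two shapes described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: 'fieldName' comes from the description above, while `value`, `parentId` and the `_id` on the source record are hypothetical names chosen for the sketch, not part of the real schema.

```javascript
// Option A: one small document per ArrayThree item.
// Every output document has the same fixed keys, so they can be indexed.
function flattenVertically(record) {
  return (record.ArrayThree || []).map((item) => ({
    parentId: record._id,      // assumed link back to the source record
    fieldName: item.fieldName, // fixed key shared by every document
    value: item.value,         // hypothetical payload field
  }));
}

// Option B: one wide document whose keys are the items' fieldName values.
// The keys differ from record to record, which is why a fixed index is hard.
function flattenHorizontally(record) {
  const flat = { parentId: record._id };
  for (const arrayName of ["ArrayOne", "ArrayTwo", "ArrayThree"]) {
    for (const item of record[arrayName] || []) {
      // A later item with the same fieldName overwrites an earlier one.
      flat[item.fieldName] = item.value;
    }
  }
  return flat;
}

// Tiny made-up record, far smaller than the real ones.
const sample = {
  _id: 1,
  ArrayOne: [{ fieldName: "a1", value: 5 }],
  ArrayTwo: [{ fieldName: "b1", value: 7 }],
  ArrayThree: [
    { fieldName: "c1", value: 10 },
    { fieldName: "c2", value: 20 },
  ],
};

console.log(flattenVertically(sample));   // two small documents
console.log(flattenHorizontally(sample)); // one wide document
```

With the vertical shape, the number of documents grows with the number of ArrayThree items, but every document looks the same. With the horizontal shape, the document count stays at one per source record and the set of keys varies.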

At this time, I have approximately 300K existing records from which I would build this view. If I go vertical, that would place about 60 million simple records in the database (300K records times roughly 200 items each). If I go horizontal, it would be 300K records with 200 flattened fields each and no ability to index them.

What's the right way to approach this?

Solution

I'd be inclined to stick with the mongo philosophy and do individual entries for each distinct set/piece of information, rather than relying on references within a weird composite object.

60 million records is "a lot" (but it really isn't "a ton"), and MongoDB loves to have lots of little things tossed at it. On the flip side, with the horizontal approach you'd end up with fewer, bigger objects that take up just as much space.

(*Using the WiredTiger storage engine with compression will make your disk go further too.)

**Edit: I'd also add that you really, really, really want indexes at the end of the day, so that's another vote for this approach.
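To make the point about indexes concrete, here is a rough sketch of how a report over the vertical collection might look with the MongoDB Node.js driver. The field names `value` and `parentId` and the choice of a compound index are assumptions made for illustration; only 'fieldName' comes from the question.

```javascript
// Hypothetical index on the fixed keys of the flattened documents.
// "fieldName" is from the question; "parentId" is an assumed field.
const indexSpec = { fieldName: 1, parentId: 1 };

// Build an aggregation pipeline that sums and counts the assumed
// "value" field for a single fieldName.
function buildReportPipeline(fieldName) {
  return [
    // $match comes first so the index on fieldName can be used.
    { $match: { fieldName: fieldName } },
    {
      $group: {
        _id: "$fieldName",
        total: { $sum: "$value" },
        count: { $sum: 1 },
      },
    },
  ];
}

// `collection` is expected to be a Node.js driver Collection object
// for the flattened (vertical) collection.
async function runReport(collection, fieldName) {
  await collection.createIndex(indexSpec);
  return collection.aggregate(buildReportPipeline(fieldName)).toArray();
}
```

Because every flattened document carries the same keys, a single index definition covers all of them, which is not the case when the keys are dynamic.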
