Best possible schema design for a log analysis database in MongoDB


Question

I have to store the following data in MongoDB: uid, gender, country, city, date_of_visit, url_of_visit.

I would like to store uid, gender, country, and city in one collection, because this information will never change for a particular user.

In the other collection I would like to store uid, date_of_visit, and url_of_visit.

I want to know the best practice for storing uid, date_of_visit, and url_of_visit. I have two options in mind:

    (a) { uid: 100, date: xxxxxxxxxxxxxxx, url: abc.php }
        { uid: 100, date: xxxxxx, url: ref.php }
        { uid: 200, date: xxxxxxxxx, url: ref.php } 

    (b) { uid:100, visit:[{date:xxxxxxx, url:abc.php},
                          {date:xxxx, url:def.php},
                          {.........................}]}

I want to have the following index: date:1, uid:1, url:1. The problem with approach (a) is that with each row inserted, the database size and the index size will grow, and there will come a point when the index no longer fits in RAM.

The problem with approach (b) is that at some point a user's document will exceed the 16 MB limit, and the approach will fail from then on.
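To get a feel for when approach (b) would hit that limit, the arithmetic can be sketched from the BSON encoding rules. This is only a rough estimate: it counts the visit array entries alone, ignores the outer document's own fields, and assumes every URL is as short as the sample `abc.php`, so real-world numbers would be lower. The function names are made up for illustration.

```python
# Rough BSON size arithmetic for approach (b). Field names follow the question;
# the sample URL length is an assumption, and outer-document overhead is ignored.
MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit


def visit_entry_bytes(url: str, index: int) -> int:
    """Approximate BSON bytes for one {date, url} element of the visit array."""
    # type byte + key + NUL + 8-byte UTC datetime
    date_field = 1 + len("date") + 1 + 8
    # type byte + key + NUL + int32 length + UTF-8 text + NUL
    url_field = 1 + len("url") + 1 + 4 + len(url.encode("utf-8")) + 1
    # int32 size prefix + fields + trailing NUL
    subdocument = 4 + date_field + url_field + 1
    # array elements are keyed by their index as a string ("0", "1", ...) + NUL
    array_key = len(str(index)) + 1
    # type byte for the embedded document + key + the document itself
    return 1 + array_key + subdocument


def max_visits(url: str) -> int:
    """Count how many entries fit before the array alone reaches the limit."""
    total, count = 0, 0
    while True:
        size = visit_entry_bytes(url, count)
        if total + size > MAX_BSON_BYTES:
            return count
        total += size
        count += 1
```

Under these assumptions a single user document holds a few hundred thousand visits before it fails, so the limit is a real concern mainly for very active users or long retention periods.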

Please suggest the best schema design for this scenario. I would also have queries that include uid, gender, country, date_of_visit, and url_of_visit.
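A query that combines profile fields with visit fields could, under approach (a), be expressed as an aggregation pipeline that joins the two collections. The sketch below only builds the list of stages. The collection name `users`, the field name `date` on visit documents, and the function name are assumptions, and `$lookup` requires MongoDB 3.2 or later.

```python
from datetime import datetime


def visits_by_profile_pipeline(gender: str, country: str,
                               start: datetime, end: datetime) -> list:
    """Stages to run on a hypothetical `visits` collection shaped like approach (a)."""
    return [
        # Narrow by date first so an index that starts with `date` can help.
        {"$match": {"date": {"$gte": start, "$lt": end}}},
        # Join each visit to its profile in the (assumed) `users` collection.
        {"$lookup": {"from": "users", "localField": "uid",
                     "foreignField": "uid", "as": "user"}},
        # $lookup yields an array; flatten it to one profile per visit.
        {"$unwind": "$user"},
        # Filter on the profile fields that live in the other collection.
        {"$match": {"user.gender": gender, "user.country": country}},
        # Return a flat row with fields from both sides.
        {"$project": {"_id": 0, "uid": 1, "date": 1, "url": 1,
                      "gender": "$user.gender", "country": "$user.country"}},
    ]
```

With a driver such as pymongo the returned list would be passed to the visits collection's `aggregate` method. Whether a join like this performs acceptably depends on data volume, which is not stated here.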

Recommended answer

I know this thread is a bit old, but I'm wondering whether you've decided on a structure and whether it works well.

My idea is, instead of risking documents that grow too large, to structure it like your second approach but include the date at the top level of each document. This way each document holds one user's activity for one day. It would be indexed by user and date, easy to update and query, and it keeps things organized.

Something like:

{ uid:100, date:xxxxxxx, event:[{time:xxxxxxx, url:abc.php},
                                {time:xxxx, url:def.php},
                                {.........................}]}
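One way the per-day document could be maintained is with an upsert that pushes each visit onto the `event` array of that user's bucket for the day. The following is a minimal sketch, not a tested production design. The helper names and the `BUCKET_INDEX` constant are hypothetical, and `collection` is assumed to be any object exposing an `update_one(filter, update, upsert=...)` method, such as a pymongo `Collection`.

```python
from datetime import datetime

# Hypothetical compound index on user and day, written in the list-of-pairs
# form that pymongo's create_index accepts.
BUCKET_INDEX = [("uid", 1), ("date", 1)]


def bucket_upsert_args(uid: int, visited_at: datetime, url: str):
    """Build the filter and update documents for recording one visit."""
    # Truncate to midnight so every visit on the same day hits the same bucket.
    day = visited_at.replace(hour=0, minute=0, second=0, microsecond=0)
    bucket_filter = {"uid": uid, "date": day}
    # Append the visit to the bucket's event array.
    update = {"$push": {"event": {"time": visited_at, "url": url}}}
    return bucket_filter, update


def record_visit(collection, uid: int, visited_at: datetime, url: str):
    """Append a visit to the day's bucket, creating the bucket if it is missing."""
    bucket_filter, update = bucket_upsert_args(uid, visited_at, url)
    # upsert=True makes the first visit of the day insert the bucket document.
    return collection.update_one(bucket_filter, update, upsert=True)
```

Because each document covers a single user and day, the array stays small for ordinary traffic, and the index grows with the number of user-days instead of the number of visits.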
