What is a good MongoDB document structure for most efficient querying of user followers/followees?


Question

I've been wondering about the ideal document structure for maximum query efficiency in various situations, and there's one I want to ask about. It really comes from my not knowing how MongoDB behaves in memory in this specific kind of case. Let me give you a hypothetical scenario.

Imagine a Twitter-style system of Followers and Followees. After an admittedly cursory glance, the main options appear to be:

  1. In each user document, a "followers" array containing references to all the documents of other users they follow. Followees are found by finding our current user in other users' "user.followers" array. The main downside would appear to be the potential query overhead of the Followee search. Also, for a query specifically for the contents of "user.followers", does MongoDB just access the required field in users' documents, or is the whole user document found and the required field values then looked up from there? And is this cached/stored in such a way that a query over a large user base would require significantly more memory?

  2. In each user document, storing both "followers" and "followees" for quicker access to each. This obviously has the downside of duplicate data, in the sense that an entry for user A following user B exists in both user documents in the respective field, and a deletion from one requires a matching deletion in the other. Technically, this could be considered doubling the number of potential points of failure for a simple deletion. And does MongoDB still suffer from what I've heard described as "swiss cheesing" of its memory-stored data when deletions occur, so that removing from 2 fields rather than 1 doubles the effect of that memory hole problem?

  3. A separate collection for storing users' Followers, queried in a similar fashion to the user documents in option 1, except that the only data being accessed is Followers, so if the user documents contain quite a lot of other data relevant to each user, we avoid accessing that data. This seems to have something of a relational database feel to it, though, and while I know that's not always a terrible approach just on principle, if one of the other approaches mentioned (or one I haven't considered) is better under Mongo's architecture, I'd love to learn!
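
To make the three options easier to compare, here is a rough sketch of the document shapes as plain JavaScript objects. It is an illustration only, not driver code: the field names (`following`, `followees`, `followers`, `follower`, `followee`) and the sample users are assumptions, and option 1's array is named `following` here simply to make clear that it holds the users someone follows.

```javascript
// Hypothetical document shapes for the three options (plain objects, no database).
// In every variant, alice follows bob.

// Option 1: one array per user; the reverse direction needs a search.
const option1 = [
  { _id: "alice", following: ["bob"] },
  { _id: "bob", following: [] },
];

// Option 2: both directions stored; every follow/unfollow touches two documents.
const option2 = [
  { _id: "alice", followees: ["bob"], followers: [] },
  { _id: "bob", followees: [], followers: ["alice"] },
];

// Option 3: a separate collection with one small document per relationship.
const option3 = [{ follower: "alice", followee: "bob" }];

// "Who follows bob?" under each option.
const followersOfBob1 = option1
  .filter((u) => u.following.includes("bob"))
  .map((u) => u._id);
const followersOfBob2 = option2.find((u) => u._id === "bob").followers;
const followersOfBob3 = option3
  .filter((e) => e.followee === "bob")
  .map((e) => e.follower);

console.log(followersOfBob1, followersOfBob2, followersOfBob3);
```

All three lookups give the same answer; what differs is how many documents a write has to touch and how much unrelated user data sits next to the relationship data.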

If anyone has any thoughts on this, or wants to tell me I've missed a very relevant and obvious docs page somewhere, or even wants to tell me that I'm just being stupid (though with an explanation of why, please ;) ), I'd love to hear from you!

Recommended answer

This is a classic follower-followee problem and there's no one answer to it. Check out this link:

mongo db design of following and feeds, where should I embed?

Actually this situation lends itself very well to a relational schema, if MongoDB and SQL server were the only choices you had. But this is a special type of relational problem wherein you have a two-way relationship. This can perhaps be better handled by a graph database:

http://forum.kohanaframework.org/discussion/10130/followers-and-following-database-design-like-twitter/p1

The thing is, you could keep either followers or followees in a User document, but not both, to avoid the double-deletion issue. So if you must stick to MongoDB, one way out could be the following (assuming people don't follow/unfollow anyone that frequently).

Keep just the followees in the document, because when I view my profile, I'd be interested in the people I follow (that's the reason I followed them in the first place, right?). And then do a query like:

db.Users.find({ followees: user_id })

This returns everyone who is following me (say my id is user_id): the query matches each User document whose followees array contains that id.
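
As a rough illustration of the followee-in-document idea, the sketch below simulates a Users collection with an in-memory Map. The helper names (`follow`, `unfollow`, `followersOf`) and the sample users are hypothetical; the comments show the shell operations each helper stands in for, assuming a `followees` array field on every User document.

```javascript
// In-memory stand-in for a Users collection (hypothetical data and helper names).
const users = new Map([
  ["alice", { user_id: "alice", followees: [] }],
  ["bob", { user_id: "bob", followees: [] }],
  ["carol", { user_id: "carol", followees: [] }],
]);

// Follow: one write to one document.
// Shell equivalent: db.Users.updateOne({ user_id: a }, { $addToSet: { followees: b } })
function follow(a, b) {
  const doc = users.get(a);
  if (!doc.followees.includes(b)) doc.followees.push(b);
}

// Unfollow: again a single-document write, so there is no second delete to keep in sync.
// Shell equivalent: db.Users.updateOne({ user_id: a }, { $pull: { followees: b } })
function unfollow(a, b) {
  const doc = users.get(a);
  doc.followees = doc.followees.filter((id) => id !== b);
}

// Followers of b: every user whose followees array contains b.
// Shell equivalent: db.Users.find({ followees: b })
function followersOf(b) {
  return [...users.values()]
    .filter((doc) => doc.followees.includes(b))
    .map((doc) => doc.user_id);
}

follow("alice", "bob");
follow("carol", "bob");
follow("alice", "carol");
unfollow("carol", "bob");
console.log(followersOf("bob")); // [ 'alice' ]
console.log(followersOf("carol")); // [ 'alice' ]
```

In MongoDB itself, the reverse lookup would likely need an index on the array field (for example `db.Users.createIndex({ followees: 1 })`, which becomes a multikey index) so that it does not have to scan every User document the way this simulation does.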

Another reason why I don't suggest the other way round is that a typical user may follow only a few dozen people (say 30-40), so a User document storing 30-40 followees should be fine, whereas a User document for a popular account could end up storing thousands of followers! With the followee-in-document approach, you get roughly even-sized User documents throughout. With the follower-in-document approach, you will have some very small documents but also some very bulky ones. And depending on how much follower data you put in (if any, apart from follower_id), you might want to be careful about the document size limit (16 MB per BSON document).
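
For a feel of where the document size limit starts to matter, here is a back-of-the-envelope estimate. It assumes each array entry is a bare 12-byte ObjectId, that BSON encodes an array as a document keyed "0", "1", "2", and so on, and that the rest of the user document takes a hypothetical 1 KiB. It is a sketch of the arithmetic, not a measurement.

```javascript
// Rough estimate only: how many 12-byte ObjectIds fit in one array field
// before the document reaches the 16 MiB BSON document limit.
const MAX_DOC_BYTES = 16 * 1024 * 1024;
const OBJECT_ID_BYTES = 12;

// BSON stores an array as a document keyed "0", "1", "2", ...
// Each element costs: 1 type byte + key digits + 1 NUL terminator + the value.
function elementBytes(index) {
  return 1 + String(index).length + 1 + OBJECT_ID_BYTES;
}

// Count how many ids fit once the other fields have taken their share.
function maxIdsInArray(otherFieldsBytes) {
  let used = otherFieldsBytes;
  let count = 0;
  while (used + elementBytes(count) <= MAX_DOC_BYTES) {
    used += elementBytes(count);
    count += 1;
  }
  return count;
}

// Assumption: 1 KiB for everything else in the user document.
const estimate = maxIdsInArray(1024);
console.log(`about ${estimate} ids fit in one document`);
```

Under these assumptions the ceiling is in the high hundreds of thousands of ids, far above 30-40 followees but within reach of a very popular account's follower list, and it drops quickly if each entry carries more than an id.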
