MongoDB query comments along with user information


Problem description

I am creating an application with Node.js and MongoDB (not Mongoose). I have a problem that has given me a headache for a few days; could anyone please suggest a way to solve it? I have a MongoDB design like this:

post{
  _id:ObjectId(...),
  picture: 'some_url',
  comments:[
    {_id:ObjectId(...),
     user_id:ObjectId('123456'),
     body:"some content"
    },
    {_id:ObjectId(...),
     user_id:ObjectId('...'),
     body:"other content"
    }
  ]
}

user{
 _id:ObjectId('123456'),
 name: 'some name', --> changeable at any time
 username: 'some_name', --> changeable at any time
 picture: 'url_link' --> changeable at any time
}

I want to query the post along with all the user information, so that the result looks like this:

[{
  _id:ObjectId(...),
  picture: 'some_url',
  comments:[
    {_id:ObjectId(...),
     user_id:ObjectId('123456'),
     user_data:{
         _id:ObjectId('123456'),
         name: 'some name',
         username: 'some_name',
         picture: 'url_link'
     },
     body:"some content"
    },
    {_id:ObjectId(...),
     user_id:ObjectId('...'),
     body:"other content"
    } 
  ]
}]

I tried to use a loop to fetch the user data manually and add it to each comment, but that proved difficult and is beyond my coding skills :(

If anybody has a suggestion, I would really appreciate it.

P.S. I am also trying another approach: embed all the user data in each comment and, whenever users update their username, name or picture, update it in all of their comments as well.

Recommended answer

The problem(s)

As written before, there are several problems with over-embedding:

As of the time of this writing, BSON documents are limited to 16 MB. If that limit is reached, MongoDB throws an exception and you simply cannot add more comments. In the worst case, you cannot even change the (user)name or the picture if the change would increase the size of the document.
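To get a feeling for how the limit plays out, here is a rough back-of-the-envelope sketch. The helper name `maxEmbeddedComments` and the byte sizes passed to it are assumptions made up for illustration; real BSON sizes depend on field names and value types.

```javascript
// Rough estimate of how many embedded comments fit into one document.
// Both arguments are hypothetical averages, not measured values.
const MAX_BSON_BYTES = 16 * 1024 * 1024; // the 16 MB document limit in bytes

function maxEmbeddedComments(baseDocBytes, avgCommentBytes) {
  if (avgCommentBytes <= 0) {
    throw new RangeError("avgCommentBytes must be positive");
  }
  // Space left after the post's own fields, divided by the average comment size.
  const available = MAX_BSON_BYTES - baseDocBytes;
  return Math.max(0, Math.floor(available / avgCommentBytes));
}

// Example: a post of about 1 KB whose comments average 512 bytes each.
console.log(maxEmbeddedComments(1024, 512));
```

The larger the user data embedded in every comment, the fewer comments fit, which is one more reason to keep only what is needed for display.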

It is not easy to query or sort the comments array under certain conditions. Some things require a rather costly aggregation, others rather complicated statements.

While one could argue that once the queries are in place, this isn't much of a problem, I beg to differ. First, the more complicated a query is, the harder it is to optimize, both for the developer and subsequently for MongoDB's query optimizer. I have had the best results with simplifying data models and queries, speeding up responses by a factor of 100 in one instance.

When scaling, the resources needed for complicated and/or costly queries might even add up to whole machines when compared to a simpler data model and its corresponding queries.

Last but not least, you might well run into problems maintaining your code. As a simple rule of thumb:

The more complicated your code becomes, the harder it is to maintain. The harder code is to maintain, the more time it takes to maintain it. The more time it takes to maintain code, the more expensive it gets.

Conclusion: Complicated code is expensive.

In this context, "expensive" refers both to money (for professional projects) and to time (for hobby projects).

It is pretty easy: simplify your data model. Consequently, your queries will become less complicated and (hopefully) faster.

This is going to be a wild guess on my part, but the important thing here is to show you the general method. I'd define your use cases as follows:

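  1. For a given post, users should be able to comment
  2. For a given post, show the author and the comments, along with the usernames and pictures of the commenters and the author
  3. For a given user, it should be easy to change the name, username and picture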

Step 2: Model your data accordingly

Users

First of all, we have a straightforward user model

{
  _id: new ObjectId(),
  name: "Joe Average",
  username: "HotGrrrl96",
  picture: "some_link"
}

Nothing new here, added just for completeness.

{
  _id: new ObjectId(),
  title: "A post",
  content: "Interesting stuff",
  picture: "some_link",
  created: new ISODate(),
  author: {
    username: "HotGrrrl96",
    picture: "some_link"
  }
}

And that's about it for a post. There are two things to note here. First, we store the author data we immediately need when displaying a post, since this saves us a query for a very common, if not ubiquitous, use case. Second, why don't we store the comments and the commenters' data the same way? Because of the 16 MB size limit, we want to avoid storing all the references in a single document. Rather, we store the references in comment documents:

{
  _id: new ObjectId(),
  post: someObjectId,
  created: new ISODate(),
  commenter: {
    username: "FooBar",
    picture: "some_link"
  },
  comment: "Awesome!"
}

The same as with posts, we have all the necessary data for displaying a comment.
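As a minimal sketch of how such a comment document could be assembled in application code: the helper name `buildComment` and its parameters are hypothetical, and it assumes `user` is a document from the users collection. The `_id` is left out on the assumption that it is generated on insert.

```javascript
// Builds a comment document in the shape shown above.
// Only the user fields needed for display are copied into the comment.
function buildComment(postId, user, text, now = new Date()) {
  return {
    post: postId,          // reference to the post being commented on
    created: now,          // used later for sorting by recency
    commenter: {
      username: user.username,
      picture: user.picture
    },
    comment: text
  };
}

// Example with made-up values.
const exampleComment = buildComment(
  "somePostId",
  { _id: "123456", name: "Joe Average", username: "FooBar", picture: "some_link" },
  "Awesome!"
);
console.log(exampleComment.commenter.username);
```

Note that the user's `name` and `_id` are deliberately not copied here, since the model above only keeps `username` and `picture` in the comment.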

What we have achieved now is that we have circumvented the BSON size limit, and we don't need to refer to the user data in order to display posts and comments, which should save us a lot of queries. But let's come back to the use cases and some more queries.

This is completely straightforward now.

All comments

db.comments.find({post:objectIdOfPost})

For the 3 latest comments

db.comments.find({post:objectIdOfPost}).sort({created:-1}).limit(3)

So for displaying a post and all (or some) of its comments, including the usernames and pictures, we are at two queries. That is more than you needed before, but we have circumvented the size limit, and you can have a practically unlimited number of comments for every post. But let's get to something more realistic.

Getting the latest posts together with their latest comments is a two-step process. However, with proper indexing (we will come back to that later), it should still be fast (and hence resource-saving):

// Latest 5 posts, each with its 3 latest comments
var posts = db.posts.find().sort({created:-1}).limit(5);
posts.forEach(
  function(post) {
    doSomethingWith(post);
    var comments = db.comments.find({"post":post._id}).sort({"created":-1}).limit(3);
    doSomethingElseWith(comments);
  }
);

Get all posts of a given user (sorted from newest to oldest) and their comments

var posts = db.posts.find({"author.username": "HotGrrrl96"},{_id:1}).sort({"created":-1});
var postIds = [];
posts.forEach(
  function(post){
    postIds.push(post._id);
  }
)
var comments = db.comments.find({post: {$in: postIds}}).sort({post:1, created:-1});

Note that we have only two queries here. Although you need to make the connection between posts and their respective comments "manually", that should be pretty straightforward.
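One way to make that "manual" connection is sketched below. The helper name `groupCommentsByPost` is hypothetical; it assumes the comments have already been loaded into an array and that ids can be compared by their string form.

```javascript
// Groups already-loaded comments under the post they belong to.
// Returns a Map keyed by the post id as a string; post order and the
// order of comments within each post are preserved as given.
function groupCommentsByPost(postIds, comments) {
  const byPost = new Map();
  for (const id of postIds) {
    byPost.set(String(id), []); // posts without comments get an empty list
  }
  for (const comment of comments) {
    const bucket = byPost.get(String(comment.post));
    if (bucket) {
      bucket.push(comment);
    }
  }
  return byPost;
}

// Example with made-up ids in place of ObjectIds.
const grouped = groupCommentsByPost(
  ["p1", "p2"],
  [
    { post: "p2", comment: "first" },
    { post: "p1", comment: "second" },
    { post: "p2", comment: "third" }
  ]
);
console.log(grouped.get("p2").length);
```

Because the comments query above sorts by `created` descending within each post, the lists in the map come out newest first without any extra sorting.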

Changing a username is presumably a rarely executed use case. However, it isn't very complicated with this data model.

First, we change the user document

db.users.update(
  { username: "HotGrrrl96"},
  {
    $set: { username: "Joe Cool"},
    $push: {oldUsernames: "HotGrrrl96" }
  },
  {
    writeConcern: {w: "majority"}
  }
);

We push the old username to a corresponding array. This is a safety measure in case something goes wrong with the following operations. Furthermore, we set the write concern to a rather high level in order to make sure the data is durable.

db.posts.update(
  { "author.username": "HotGrrrl96"},
  { $set:{ "author.username": "Joe Cool"} },
  {
    multi:true,
    writeConcern: {w:"majority"}
  }
)

Nothing special here. The update statement for the comments looks pretty much the same. While those queries take some time, they are rarely executed.
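To keep the three updates together, the rename could be described as data first and executed afterwards. This is only a sketch: the helper name `buildRenameOps` is hypothetical, and the `commenter.username` path is taken from the comment model above.

```javascript
// Describes the updates needed to rename a user across all three collections.
// Nothing is executed here; the caller decides how to run each operation.
function buildRenameOps(oldUsername, newUsername) {
  const writeConcern = { w: "majority" };
  return [
    {
      collection: "users",
      filter: { username: oldUsername },
      update: {
        $set: { username: newUsername },
        $push: { oldUsernames: oldUsername } // keep the old name as a safety net
      },
      options: { writeConcern: writeConcern }
    },
    {
      collection: "posts",
      filter: { "author.username": oldUsername },
      update: { $set: { "author.username": newUsername } },
      options: { multi: true, writeConcern: writeConcern }
    },
    {
      collection: "comments",
      filter: { "commenter.username": oldUsername },
      update: { $set: { "commenter.username": newUsername } },
      options: { multi: true, writeConcern: writeConcern }
    }
  ];
}

const renameOps = buildRenameOps("HotGrrrl96", "Joe Cool");
console.log(renameOps.map((op) => op.collection).join(", "));
```

In the shell, each entry could then be applied in order with something like `db.getCollection(op.collection).update(op.filter, op.update, op.options)`, starting with the user document so that the old username is recorded before anything else changes.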

As a rule of thumb, one can say that MongoDB uses only one index per query. While this is not entirely true, since index intersection exists, it is an easy rule to work with. Another thing is that a compound index can also serve queries that use only a leading subset (a prefix) of its fields. So an easy approach to index optimization is to find the query that uses the most fields in index-supported operations and create a compound index on those fields. Note that the order of the fields in the index matters. So, let's go ahead.

db.posts.createIndex({"author.username":1,"created":-1})

Comments

db.comments.createIndex({"post":1, "created":-1})
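Compound indexes serve queries on a leading subset (prefix) of their fields, which is why the field order in the two indexes above matters. The check below is a deliberately simplified sketch of that prefix rule, with a hypothetical helper name; the real query planner also takes sorts, range conditions and index intersection into account.

```javascript
// Simplified prefix check: do the filter fields match the first k fields
// of a compound index (in any order among themselves)?
function filterUsesIndexPrefix(indexFields, filterFields) {
  const wanted = new Set(filterFields);
  if (wanted.size === 0 || wanted.size > indexFields.length) {
    return false;
  }
  const prefix = new Set(indexFields.slice(0, wanted.size));
  return [...wanted].every((field) => prefix.has(field));
}

// The comments index from above: { post: 1, created: -1 }
const commentsIndex = ["post", "created"];
console.log(filterUsesIndexPrefix(commentsIndex, ["post"]));
console.log(filterUsesIndexPrefix(commentsIndex, ["created"]));
```

A filter on `post` alone matches the prefix of the comments index, while a filter on `created` alone does not, so a query that only filters by date would need an index of its own.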

Conclusion

A fully embedded document per post is admittedly the fastest way of loading it and its comments. However, it does not scale well, and because of the possibly complex queries necessary to deal with it, this performance advantage may be reduced or even eliminated.

With the above solution, you trade some speed (if any!) for basically unlimited scalability and a much more straightforward way of dealing with the data.

Hth.
