MongoDB Schema Design - Voting on Posts


Question

Suppose I have a website with a bunch of articles, and people can vote on the articles they like.

I want to be able to query for the articles with the most votes within a certain time window (last hour, last day, last week), ordered by the number of votes.

As usual with MongoDB, there are several different ways to implement this, but I am not sure which one is correct.

  • A Post document that contains an array of votes, where each vote is itself a document containing the user id, username and vote date:
    {
        "_id": ObjectId(xxxx),
        "title": "Post Title",
        "postdate": "21/02/2012+1345",
        "summary": "Summary of Article",
        "Votes": [
            {
                "userid": ObjectId(xxxx),
                "username": "Joe Smith",
                "votedate": "03/03/2012+1436"
            }
        ]
    }

  • A separate Votes collection, containing the details of each individual vote and a reference to the post that was voted on:
    {
        "_id": ObjectId(xxxx),
        "postId": ObjectId(xxxx),
        "userId": ObjectId(xxxx),
        "votedate": "03/03/2012+1436"
    }
    

The first one is more "documenty", but I have no idea how to query the votes array to get the documents with the most votes in the last 24 hours.

I'm leaning towards the second one, as I think it would be easier to query the vote count grouped by vote, but I'm not sure how well it would perform. This is how you'd do it in a relational database, but it doesn't seem very documenty. I'm not sure whether that's a problem, though. Is it?

Or do I use a combination of the two? Also, would I run this type of aggregate query in real time, on every page load? Or do I just run the query, say, once per minute and store the results in a query result collection?

How would you implement this schema?

Recommended answer

The common way to track overall vote counts is to keep the number of votes in the post document and to update it atomically when pushing a new value onto the Votes array.

Since it's a single update, you are guaranteed that the count will match the number of elements in the array.
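As a rough sketch of that single update, the function below builds one update document that both pushes the vote and increments a counter. The counter field name `voteCount`, the helper name `buildVoteUpdate` and the `postId` variable in the comment are assumptions made for illustration; they do not appear in the original schema.

```javascript
// Sketch only: `voteCount` is a hypothetical field name, not part of the original schema.
function buildVoteUpdate(userId, username, votedAt) {
  return {
    // Append the vote sub-document to the embedded array...
    $push: { Votes: { userid: userId, username: username, votedate: votedAt } },
    // ...and bump the counter in the same update, so both change together.
    $inc: { voteCount: 1 }
  };
}

const update = buildVoteUpdate("user-1", "Joe Smith", new Date());
// In the mongo shell this could be applied as (postId is hypothetical):
//   db.posts.update({ _id: postId }, update);
console.log(JSON.stringify(update));
```

Because `$push` and `$inc` travel in the same update document and target a single post, the array and the counter change together or not at all.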

If the number of aggregations is fixed and the site is very busy, you could extend this paradigm and increment additional counters, such as one each for month, day and hour, but that could get out of hand very quickly. So instead you could use the new Aggregation Framework (available in the 2.1.2 dev release; it will be in production in release 2.2). It is simpler to use than Map/Reduce, and it will let you do the calculations you want very simply, especially if you take care to store your vote dates as ISODate() type.
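The sample documents in the question store dates as strings such as "03/03/2012+1436", which cannot be range-queried as dates. A minimal sketch of converting them before saving follows. It assumes the strings are day/month/year followed by a 24-hour time (as "21/02/2012+1345" suggests) and that they are in UTC; neither is stated in the question, and `parseVoteDate` is a hypothetical helper name.

```javascript
// Assumption: strings look like "DD/MM/YYYY+HHMM" (as in "21/02/2012+1345") and are in UTC.
function parseVoteDate(text) {
  const m = /^(\d{2})\/(\d{2})\/(\d{4})\+(\d{2})(\d{2})$/.exec(text);
  if (m === null) {
    throw new Error("Unrecognised date string: " + text);
  }
  const [day, month, year, hour, minute] = m.slice(1).map(Number);
  // Date.UTC takes a zero-based month.
  return new Date(Date.UTC(year, month - 1, day, hour, minute));
}

console.log(parseVoteDate("03/03/2012+1436").toISOString());
// prints 2012-03-03T14:36:00.000Z
```

A JavaScript Date saved from the mongo shell is stored as a BSON date, which the shell displays as ISODate(...).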

A typical aggregation pipeline for this month's top vote getters might look something like this:

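    // Start of the current month and start of the next month (local time).
    var today = new Date();
    var thisMonth = new Date(today.getFullYear(), today.getMonth());
    var thisMonthEnd = new Date(today.getFullYear(), today.getMonth() + 1);

    db.posts.aggregate( [
        // Keep only posts that have at least one vote this month.
        {$match: { "Votes.votedate": {$gte: thisMonth, $lt: thisMonthEnd} } },
        // Emit one document per vote.
        {$unwind: "$Votes" },
        // Discard the votes that fall outside this month.
        {$match: { "Votes.votedate": {$gte: thisMonth, $lt: thisMonthEnd} } },
        // Count the remaining votes per title.
        {$group: { _id: "$title", votes: {$sum: 1} } },
        {$sort: {"votes": -1} },
        {$limit: 10}
    ] );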
    

This limits the input to the pipeline to posts that have at least one vote dated within the month you are counting, "unwinds" the array to get one document per vote, filters again so that only this month's votes remain, and then does a "group by" equivalent that sums up all votes for each title (I'm assuming title is unique). It then sorts descending by number of votes and limits the output to the first ten.
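The same stages can be parameterised for the windows the question asks about (last hour, last day, last week). The following is a sketch rather than a tuned implementation: `topPostsPipeline` and `HOUR_MS` are hypothetical names, and it assumes vote dates are stored as real dates rather than strings.

```javascript
// Hypothetical helper: builds the pipeline for an arbitrary [since, until) window.
function topPostsPipeline(since, until, limit) {
  const inWindow = { "Votes.votedate": { $gte: since, $lt: until } };
  return [
    { $match: inWindow },    // drop posts with no vote in the window
    { $unwind: "$Votes" },   // one document per vote
    { $match: inWindow },    // drop votes outside the window
    { $group: { _id: "$title", votes: { $sum: 1 } } },
    { $sort: { votes: -1 } },
    { $limit: limit }
  ];
}

const HOUR_MS = 60 * 60 * 1000;
const now = new Date();
const lastDay = topPostsPipeline(new Date(now.getTime() - 24 * HOUR_MS), now, 10);
// In the mongo shell: db.posts.aggregate(lastDay);
console.log(JSON.stringify(lastDay, null, 2));
```

Swapping `24 * HOUR_MS` for `HOUR_MS` or `7 * 24 * HOUR_MS` gives the last-hour and last-week variants.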

You can also aggregate the votes by day (for example) for that month, to see which days are most active for voting:

    // Reuses thisMonth and thisMonthEnd from the previous example.
    db.posts.aggregate( [
        {$match: { "Votes.votedate": {$gte: thisMonth, $lt: thisMonthEnd} } },
        {$unwind: "$Votes" },
        {$match: { "Votes.votedate": {$gte: thisMonth, $lt: thisMonthEnd} } },
        // Reduce each vote to the day of the month on which it was cast.
        {$project: { "day" : { "$dayOfMonth" : "$Votes.votedate" } } },
        // Count votes per day.
        {$group: { _id: "$day", votes: {$sum: 1} } },
        {$sort: {"votes": -1} },
        {$limit: 10}
    ] );
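To see what this pipeline computes without a running server, the same steps can be mimicked in plain JavaScript. This is an illustration only: the sample posts are made up, `votesPerDay` is a hypothetical name, and the day is taken in UTC on the assumption that this matches how $dayOfMonth reads a stored date.

```javascript
// Illustration only: made-up data, processed in plain JavaScript instead of on the server.
function votesPerDay(posts, since, until) {
  const counts = new Map();
  for (const post of posts) {
    for (const vote of post.Votes) {                            // $unwind
      if (vote.votedate >= since && vote.votedate < until) {    // $match
        const day = vote.votedate.getUTCDate();                 // $dayOfMonth (UTC)
        counts.set(day, (counts.get(day) || 0) + 1);            // $group with $sum: 1
      }
    }
  }
  return [...counts]
    .map(([day, votes]) => ({ _id: day, votes: votes }))
    .sort((a, b) => b.votes - a.votes)                          // $sort
    .slice(0, 10);                                              // $limit
}

const samplePosts = [
  { title: "A", Votes: [
    { votedate: new Date(Date.UTC(2012, 2, 3, 14, 36)) },
    { votedate: new Date(Date.UTC(2012, 2, 3, 18, 0)) },
    { votedate: new Date(Date.UTC(2012, 1, 28, 9, 0)) }   // previous month, filtered out
  ] },
  { title: "B", Votes: [
    { votedate: new Date(Date.UTC(2012, 2, 5, 9, 0)) },
    { votedate: new Date(Date.UTC(2012, 2, 3, 23, 59)) }
  ] }
];
const monthStart = new Date(Date.UTC(2012, 2, 1));
const monthEnd = new Date(Date.UTC(2012, 3, 1));

console.log(JSON.stringify(votesPerDay(samplePosts, monthStart, monthEnd)));
// prints [{"_id":3,"votes":3},{"_id":5,"votes":1}]
```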
    
