Schema for User Ratings - Key/Value DB

Problem description

We're using MongoDB and I'm figuring out a schema for storing Ratings.

  • Ratings have a value from 1 to 5.
  • I want to store additional values, such as fromUser.

This is fine, but my main question is how to set it up so that recalculating the average is as efficient as possible.

SOLUTION 1 - Separate Ratings Class

The first thought was to create a separate Ratings class and store an array of pointers to Ratings in the User class. The reason I second-guessed this is that we would have to query for all of the Ratings objects every time a new Rating comes in so that we can recalculate the average.

...

SOLUTION 2 - Dictionary in User Class

The second thought was to store a dictionary in the User class directly that would store these Ratings objects. This would be slightly more lightweight than Solution 1, but we'd be re-writing the entire Ratings history of each user every time we update. This seems dangerous.

...

SOLUTION 3 - Separate Ratings Class with Separate Averages in User Class

A hybrid option: Ratings live in their own class, with a pointer array to them, but we also keep two values in the User class, ratingsAve and ratingsCount. This way, when a new Rating is set, we save that object and can still recalculate ratingsAve easily.
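
One way to picture the incremental update in Solution 3 is the sketch below. It assumes the User document carries the two fields named in the question, ratingsAve and ratingsCount. The function name applyRating is hypothetical, and the user here is a plain in-memory object rather than a MongoDB document.

```javascript
// Hypothetical helper: fold one new rating into the stored average
// without re-reading the rating history.
function applyRating(user, value) {
  if (!Number.isInteger(value) || value < 1 || value > 5) {
    throw new RangeError('rating must be an integer from 1 to 5');
  }
  const count = user.ratingsCount + 1;
  // new average = (old average * old count + new value) / new count
  const ave = (user.ratingsAve * user.ratingsCount + value) / count;
  return { ...user, ratingsAve: ave, ratingsCount: count };
}

let user = { name: 'user1', ratingsAve: 0, ratingsCount: 0 };
for (const value of [5, 3, 4]) {
  user = applyRating(user, value);
}
console.log(user.ratingsAve, user.ratingsCount); // 4 3
```

Keeping a running sum next to the count, instead of the average itself, would avoid the repeated multiply-and-divide and the rounding that can come with it.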

SOLUTION 3 sounds best to me, but I'm wondering whether we'd need periodic calibrations, requerying the Ratings history to reset ratingsAve, just to make sure everything checks out.
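
A periodic calibration of this kind could look roughly like the sketch below. It is only an illustration: recalibrate is a hypothetical name, the rating history is a plain array standing in for the result of a query, and the 1e-9 tolerance is an arbitrary assumption.

```javascript
// Hypothetical calibration pass: recompute the counters from the full
// rating history and report whether the stored values had drifted.
function recalibrate(user, ratings) {
  const count = ratings.length;
  const sum = ratings.reduce((total, r) => total + r.value, 0);
  const ave = count === 0 ? 0 : sum / count;
  const drifted =
    user.ratingsCount !== count || Math.abs(user.ratingsAve - ave) > 1e-9;
  return { user: { ...user, ratingsAve: ave, ratingsCount: count }, drifted };
}

const history = [{ value: 5 }, { value: 3 }, { value: 4 }];
// Stored counters that missed the last rating.
const stored = { name: 'user1', ratingsAve: 4, ratingsCount: 2 };
const result = recalibrate(stored, history);
console.log(result.drifted, result.user.ratingsAve, result.user.ratingsCount); // true 4 3
```

Because the pass reads the whole history, it is something to run occasionally in the background, not on every new rating.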

I might be overthinking this but I'm not that great at DB schema creation, and this seems like a standard schema issue that I should know how to implement.

Which is the best option to ensure consistency but also efficiency of recalculation?

Recommended answer

First of all, 'Dictionary in User Class' is not a good idea. Why? Each additional rating has to be pushed onto an array inside the user document, so the document keeps growing. Once it no longer fits in the space allocated for it, MongoDB has to relocate it, which is called "moving a document". Moving documents is slow, and MongoDB is not great at reusing empty space, so moving documents around a lot can leave large swaths of empty space in the data files (paraphrased from the book 'MongoDB: The Definitive Guide').

So what is the correct solution? Assume you have a collection named Blog and want to implement ratings for your blog posts, while also keeping track of every individual rating a user submits.

The schema for a Blog document would be:

{
   _id : ....,
   title: ....,
   ....
   rateCount : 0,
   rateValue : 0,
   rateAverage: 0
}

You need another collection (Rates) with this document schema:

{
    _id: ....,
    userId: ....,
    postId:....,
    value: ..., //1 to 5
    date:....   
}

And you need to define a proper index for it:

// Speeds up the lookup that checks whether a user has already rated a post.
db.Rates.createIndex({userId : 1, postId : 1})

When a user wants to rate a post, first check whether that user has already rated it. Assuming the user is 'user1', the query would be:

var ratedBefore = db.Rates.find({userId : 'user1', postId : 'post1'}).count()

Then, based on ratedBefore: if !ratedBefore, insert a new rate document into the Rates collection and update the blog's counters; otherwise, the user is not allowed to rate again.

if(!ratedBefore)
{
    var postId = 'post1'; // this id should be passed in by the client driver
    var userId = 'user1'; // this id should be passed in by the client driver
    var rateValue = 1; // to 5
    var rate = 
    {       
       userId: userId,
       postId: postId,
       value: rateValue,
       date:new Date()  
    };

    db.Rates.insert(rate);
    db.Blog.update({"_id" : postId}, {$inc : {'rateCount' : 1, 'rateValue' : rateValue}});
}
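
The check-then-insert sequence above leaves a window in which two concurrent requests from the same user could both pass the check. One possible way to close that window, sketched here under assumptions that are not part of the original answer, is to declare the index unique (db.Rates.createIndex({userId: 1, postId: 1}, {unique: true})) and let the insert itself reject duplicates. The function name rateOnce is hypothetical, db is passed in as a parameter, and the sketch assumes a synchronous shell in which a duplicate key surfaces as a thrown error carrying code 11000.

```javascript
// Hypothetical helper. Assumes Rates has a UNIQUE index on {userId, postId}
// and that a duplicate insert throws an error whose code is 11000.
function rateOnce(db, userId, postId, rateValue) {
  try {
    db.Rates.insertOne({ userId, postId, value: rateValue, date: new Date() });
  } catch (err) {
    if (err.code === 11000) {
      return false; // this user has already rated this post
    }
    throw err; // any other failure is unexpected
  }
  // Only reached when the rating was newly recorded.
  db.Blog.updateOne(
    { _id: postId },
    { $inc: { rateCount: 1, rateValue: rateValue } }
  );
  return true;
}
```

With the Node.js driver the same calls return promises and would need await. The insert and the counter update are still two separate writes, so an occasional recount from Rates remains a reasonable safeguard.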

What happens to rateAverage, then? I strongly recommend calculating it on the client side from rateCount and rateValue. It is easy to update rateAverage with a MongoDB query, but you shouldn't. Why? Simply put, this kind of calculation is trivial for the client, and storing the average on every blog document would require an extra, unnecessary update operation.

The average would then be computed as:

var blog = db.Blog.findOne({"_id" : "post1"});
var avg = blog.rateCount > 0 ? blog.rateValue / blog.rateCount : 0; // avoid 0/0 (NaN) when the post has no ratings yet
print(avg);

With this approach you get the best performance from MongoDB, and you keep track of every rating by user, post and date.
