RethinkDB Survey Modelling


Question

I'm starting a new project (a survey application) and I chose RethinkDB as my database; however, I have some questions about data modelling. Each question can be answered only once by each user. I'll also have reports that show the percentage of users who chose each option. At first I thought of the following model:

{
  title: String,
  total_answers: Number,
  options: [{
    value: Number,
    label: String,
    respondents: [User IDs]
  }]
}

The problem is that RethinkDB recommends embedding only a few hundred items in an array, and I'll probably have 500+ respondents per survey. The other option is to create an answers table that links to the question ID and the chosen option, but report queries may become a problem because I'll have many questions per survey.

Which path should I follow? Thanks!

Recommended answer

First of all, this is the classic data-modelling choice between embedding and joining. Since it is a classic, let's recall what has already been written about it, for reference:

  • https://rethinkdb.com/docs/data-modeling/
  • http://openmymind.net/Multiple-Collections-Versus-Embedded-Documents/
  • MongoDB relationships: embed or reference?

Before we go ahead, let's agree that each solution has its own strengths and weaknesses.

Now back to your question. As you wrote, embedding has its own issues. Embedding requires loading the whole document into memory: any query you run against it loads the whole document. Also, whenever you change the options.respondents array, RethinkDB rewrites the whole document. You will also have many users answering the survey at the same time, each of them being added to options.respondents. That means a lot of writes.

In my opinion, embedding is good for data that doesn't need to stand on its own within the scope of the application. Data that is always used together with its parent, and that you rarely need to access separately, is a good candidate for embedding.

Data that needs frequent access on its own should live in a separate table. Use a JOIN to run reports and merge the results.

As you wrote, you do want to run report queries; that is a sign you should keep the answers separate. It gives great flexibility: because the data lives in its own table, you don't have to dig into an array and transform the data.
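To make the separate-table idea concrete, here is a minimal sketch in plain Python, not RethinkDB code. The table is simulated with a dict, and the field names (question_id, user_id, option) and the "<question_id>:<user_id>" primary key are my assumptions, not something from your schema. It shows one way to keep one answer per user per question and to compute the percentage report without touching any array.

```python
from collections import Counter

# Hypothetical "answers" table, modelled as a dict keyed by primary key.
# The key "<question_id>:<user_id>" is an assumption chosen so that a second
# answer from the same user to the same question collides with the first.
answers = {}

def record_answer(table, question_id, user_id, option):
    """Store one answer; return False if this user already answered."""
    key = f"{question_id}:{user_id}"
    if key in table:
        return False
    table[key] = {"id": key, "question_id": question_id,
                  "user_id": user_id, "option": option}
    return True

def option_percentages(table, question_id):
    """Return {option value: percentage of respondents} for one question."""
    counts = Counter(row["option"] for row in table.values()
                     if row["question_id"] == question_id)
    total = sum(counts.values())
    return {opt: 100.0 * n / total for opt, n in counts.items()}

record_answer(answers, "q1", "u1", 1)
record_answer(answers, "q1", "u2", 2)
record_answer(answers, "q1", "u3", 1)
duplicate_accepted = record_answer(answers, "q1", "u1", 2)  # same user again
report = option_percentages(answers, "q1")
```

In RethinkDB itself, an insert whose primary key already exists is rejected by default, so a deterministic key like this could probably enforce the once-only rule for you; the per-option counts would likely be a group plus count over a secondary index on the question ID.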

RethinkDB supports JOIN: you can use eqJoin with an index, or concatMap with getAll, also on an index, to make the query more efficient. For your use case, I would say go with JOIN.
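As a rough illustration of what an index-backed equality join does, here is a small plain-Python sketch. The two tables, their fields and the sample rows are hypothetical; the local eq_join function only imitates the behaviour (each result pairs a "left" row with its matching "right" row) and is not the driver's implementation.

```python
# Hypothetical "questions" and "answers" tables, held in memory for illustration.
questions = [
    {"id": "q1", "title": "Favourite editor?", "options": {1: "Vim", 2: "Emacs"}},
]
answers = [
    {"question_id": "q1", "user_id": "u1", "option": 1},
    {"question_id": "q1", "user_id": "u2", "option": 2},
    {"question_id": "q1", "user_id": "u3", "option": 1},
]

def eq_join(left_rows, left_field, right_rows, right_key="id"):
    """Pair each left row with the right row whose key equals left_row[left_field]."""
    # The dict stands in for a database index: one lookup per left row
    # instead of a scan of the right table.
    index = {row[right_key]: row for row in right_rows}
    return [
        {"left": left, "right": index[left[left_field]]}
        for left in left_rows
        if left[left_field] in index
    ]

joined = eq_join(answers, "question_id", questions)
# Resolve each chosen option value to its label via the joined question.
labels = [pair["right"]["options"][pair["left"]["option"]] for pair in joined]
```

With the Python driver, the equivalent query would be along the lines of r.table('answers').eq_join('question_id', r.table('questions')), which joins on the primary key of the right-hand table unless another index is named.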

Separating things out may also make some aggregations easier to run, such as counting the number of users in the system who participated in surveys in the first quarter of the year.
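A sketch of that kind of aggregation in plain Python, assuming each answer row carries a timestamp. The answered_at and survey_id fields and the sample rows are hypothetical; your real documents may look different.

```python
from datetime import datetime

# Hypothetical answer rows; "answered_at" is an assumed timestamp field.
answers = [
    {"user_id": "u1", "survey_id": "s1", "answered_at": datetime(2016, 1, 15)},
    {"user_id": "u1", "survey_id": "s2", "answered_at": datetime(2016, 2, 3)},
    {"user_id": "u2", "survey_id": "s1", "answered_at": datetime(2016, 3, 31)},
    {"user_id": "u3", "survey_id": "s1", "answered_at": datetime(2016, 4, 1)},
]

def participants_in_quarter(rows, year, quarter):
    """Return the set of distinct user IDs that answered in the given quarter."""
    first_month = 3 * (quarter - 1) + 1
    months = range(first_month, first_month + 3)
    return {
        row["user_id"]
        for row in rows
        if row["answered_at"].year == year and row["answered_at"].month in months
    }

q1_users = participants_in_quarter(answers, 2016, 1)
q1_count = len(q1_users)
```

Because every answer is its own row, the filter and the distinct count work directly on the table; with embedded respondents arrays the same report would first have to unpack every question document.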

I still don't have a clear idea of the kind of data you have. If you update your question with the kind of data you want to store, I can help create a data model around it.
