Relational to NoSQL Database


Question

This question is for all NoSQL and especially mongoDB experts out there. I started by designing a relational DB for a project, but the client wants us to use a DB that can scale easily. To achieve this, we have decided to use mongoDB. I am now having trouble mapping my relational model to NoSQL. I have a users table that has a many-to-many relation with a lot of other tables (battles, items, locations, and units).

I have a few options when converting it for mongoDB:

Option 1 (complete rows embedded in users):

users:{
  _id:<user_id>,
  battles: [battle1, battle2, ...],
  items: [item1, item2, ...],
  locations: [location1, location2, ...],
  units: [unit1, unit2, ...]
}

battles:{
  <battle_info>
}

locations:{
  <location_info>
}

units:{
  <units_info>
}

items:{
  <items_info>
}

Option 2 (users hold only foreign keys):

users:{
  _id:<user_id>,
  battles: [battle1_id, battle2_id, ...],
  items: [item1_id, item2_id, ...],
  locations: [location1_id, location2_id, ...],
  units: [unit1_id, unit2_id, ...]
}

battles:{
  <battle_info>
}

locations:{
  <location_info>
}

units:{
  <units_info>
}

items:{
  <items_info>
}

Option 3 (user IDs in the other tables):

users:{
  _id:<user_id>,
}

battles:{
  <battle_info>,
  user: [user1_id, user2_id, ...]
}

locations:{
  <location_info>,
  user: [user1_id, user2_id, ...]
}

units:{
  <units_info>,
  user: [user1_id, user2_id, ...]
}

items:{
  <items_info>,
  user: [user1_id, user2_id, ...]
}

Option 1 has a lot of duplication, as we are adding complete rows of other tables. One issue I see is that if a certain item or battle is updated, we will have to find all occurrences of it in the users table and update them as well. But it gives us the advantage of always having a complete user object that can be handed to the client application at login.

Option 2 is more relational: the users table holds only the mongoIds of rows in the other tables. The advantage of this option is that updating a battle or item is cheap, because rows are referenced, not copied. On the other hand, when a user logs in we will have to find all referenced units, battles, items, and locations to respond with a complete user object.

Option 3 is the opposite of option 2: the mongoIds of the users table are kept in the other tables. This option doesn't appeal much to me.

I would really appreciate it if someone could guide me or come up with a better model.

Edit:

Basically this is an MMORPG where multiple client apps connect to the server through web services. We have a local DB on the client to store data. I want a model with which the server can respond with a complete user object and then update or insert the data changed on the client apps.

Recommended answer

First, NoSQL is not one size fits all. In SQL, almost every 1:N and M:N relation is modeled in the same way. The NoSQL philosophy is that the way you model the data depends on the data and its use patterns.

Second, I agree with Mark Baker: scaling is hard, and it's achieved by loosening constraints. It's not a matter of technology. I love working with MongoDB, but for other reasons (no need to code ugly SQL, no need for a complicated, bloated ORM, etc.).

Now let's review your options:

Option 1 copies more data than needed. You will often have to denormalize some data, but never all of it. If you find yourself needing all of it, it's cheaper to fetch the referenced object.
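To make the cost of full copies concrete, here is a minimal sketch in plain Python. The dicts are in-memory stand-ins for the collections, not a MongoDB API, and the `status` field and the helper `update_battle_everywhere` are made up for illustration. It shows that one change to a battle has to be repeated in every user document that embeds it.

```python
# In-memory stand-ins for two collections; plain dicts, not a MongoDB API.
battles = {4354: {"_id": 4354, "name": "The battle of foo", "status": "open"}}

# Option 1: every participating user document carries a full copy of the battle.
users = {
    1: {"_id": 1, "battles": [dict(battles[4354])]},
    2: {"_id": 2, "battles": [dict(battles[4354])]},
    3: {"_id": 3, "battles": []},
}

def update_battle_everywhere(battle_id, changes):
    """Apply changes to the battle and to every embedded copy.

    Returns the number of embedded copies that had to be rewritten.
    """
    battles[battle_id].update(changes)
    touched = 0
    for user in users.values():
        for embedded in user["battles"]:
            if embedded["_id"] == battle_id:
                embedded.update(changes)
                touched += 1
    return touched

# One logical change turns into one write per user who took part in the battle.
touched = update_battle_everywhere(4354, {"status": "finished"})
```

With references instead of copies, the same change would be a single write to the battle document.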

Options 2 and 3 are very similar. The key here is: who's writing? You don't want a lot of clients having write access to the same document, because that will force you to use a locking mechanism and/or restrict yourself to modifier operations only. Therefore, option 2 is probably better than 3. However, if A attacks B, A also triggers a write to user B, so you have to make sure your writes are safe.
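The multiple-writer problem can be sketched without a database. The following is an in-memory simulation, not MongoDB code: `store`, `read`, `replace`, and `modify` are hypothetical helpers, and the `hitpoints` field is invented for the example. In MongoDB itself, the role of `modify` is played by update operators such as `$inc` and `$push`.

```python
import copy

# A shared document that two clients want to change; a dict stands in for storage.
store = {"user_b": {"_id": "user_b", "hitpoints": 100, "battles": []}}

def read(doc_id):
    """Return a private copy, as a client would hold after fetching the document."""
    return copy.deepcopy(store[doc_id])

def replace(doc_id, doc):
    """Overwrite the whole stored document (last writer wins)."""
    store[doc_id] = doc

def modify(doc_id, inc=None, push=None):
    """Apply field-level changes in place, in the spirit of modifier operations."""
    doc = store[doc_id]
    for field, amount in (inc or {}).items():
        doc[field] += amount
    for field, value in (push or {}).items():
        doc[field].append(value)

# Read-modify-replace: both clients read before either writes, so one change is lost.
a = read("user_b")
b = read("user_b")
a["hitpoints"] -= 10          # client A records damage
b["battles"].append(4354)     # client B records a new battle
replace("user_b", a)
replace("user_b", b)          # overwrites A's damage with B's stale hitpoints
after_replace = read("user_b")

# Modifier-style writes touch only the fields they name, so both changes survive.
store["user_b"] = {"_id": "user_b", "hitpoints": 100, "battles": []}
modify("user_b", inc={"hitpoints": -10})
modify("user_b", push={"battles": 4354})
after_modify = read("user_b")
```

This only illustrates the interleaving. A real deployment still has to decide how conflicting changes to the same field are resolved.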

Option 4: Partial denormalization. Your user object seems to be the most important one, so how about this:

user { 
 battles : [ {"Name" : "The battle of foo", "Id" : 4354 }, ... ]
 ...
}

This makes it easier to show, e.g., a user dashboard, because the dashboard doesn't need all the details of each battle. Note that the data structure is then coupled to details of the presentation.
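As a rough sketch of how this plays out (plain Python with dicts standing in for collections; the `turns` and `participants` fields and both helper functions are assumptions made for the example), the dashboard is built from the embedded summaries alone, and the full battle is fetched by `Id` only when it is actually opened:

```python
# Full battle documents live in their own collection (a dict stands in for it).
battles = {
    4354: {"Id": 4354, "Name": "The battle of foo", "turns": 12, "participants": [3443, 3444]},
    4355: {"Id": 4355, "Name": "The siege of bar", "turns": 30, "participants": [3443]},
}

# The user document embeds only what the dashboard needs: name and id.
user = {
    "_id": 3443,
    "battles": [
        {"Name": "The battle of foo", "Id": 4354},
        {"Name": "The siege of bar", "Id": 4355},
    ],
}

def dashboard_lines(user_doc):
    """Build the dashboard from the embedded summaries; no battle lookup needed."""
    return [f"{b['Id']}: {b['Name']}" for b in user_doc["battles"]]

def battle_details(battle_id):
    """Fetch the full battle document only when the user opens it."""
    return battles[battle_id]

lines = dashboard_lines(user)
details = battle_details(user["battles"][0]["Id"])
```

The trade-off is that a renamed battle must also be renamed in every embedded summary, which is acceptable only if such fields rarely change.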

Option 5: Data on edges. Often, the relation needs to hold data as well:

user {
 battles : [ {"Name" : "The battle of foo", "unitsLost" : 54, "Id" : 34354 }, ... ]
}

Here, unitsLost is specific to the combination of user and battle, so the data sits on the edge of the graph. Unlike the battle's name, this data is not denormalized: it exists only in this one place.

Option 6: Linker collections. Of course, such 'edge data' can grow huge and might even call for a separate collection (a linker collection). This fully eliminates the problem of access locks, because each user/battle pair gets its own document:

user { 
  "_id" : 3443
}

userBattles {
  userId : 3443,
  battleId : 4354,
  unitsLost : 43,
  itemsWon : [ <some list > ],
  // much more data
}
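A minimal sketch of working with such a linker collection, again in plain Python with a list standing in for the `userBattles` collection (the helpers `battles_of` and `record_loss` are hypothetical). Each write touches exactly one link document and never the user or battle document:

```python
# One document per (user, battle) pair; a list stands in for the collection.
user_battles = [
    {"userId": 3443, "battleId": 4354, "unitsLost": 43},
    {"userId": 3443, "battleId": 4360, "unitsLost": 7},
    {"userId": 3444, "battleId": 4354, "unitsLost": 12},
]

def battles_of(user_id):
    """Return all link documents that belong to one user."""
    return [link for link in user_battles if link["userId"] == user_id]

def record_loss(user_id, battle_id, units):
    """Add losses to the matching link document, creating it if it is missing."""
    for link in user_battles:
        if link["userId"] == user_id and link["battleId"] == battle_id:
            link["unitsLost"] += units
            return link
    link = {"userId": user_id, "battleId": battle_id, "unitsLost": units}
    user_battles.append(link)
    return link

# Two users in the same battle write to different documents, so they never contend.
record_loss(3444, 4354, 3)
total_lost = sum(link["unitsLost"] for link in battles_of(3443))
```

In a real collection, the lookup by `userId` would rely on an index on that field rather than a linear scan.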

Which of these is best depends on a lot of details of your application. If users make a lot of clicks (i.e. you have a fine-grained interface), it makes sense to split up objects as in option 4 or 6. If you really need all data in one batch, partial denormalization doesn't help, so option 2 would be preferable. Keep the multiple-writer problem in mind.
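For the "all data in one batch" case, a sketch of assembling the complete user object from option 2's id lists might look like this. It is plain Python with dicts standing in for collections; the sample documents and the helper `full_user` are invented for illustration. Against MongoDB itself, each inner lookup would correspond to one query per collection using `$in` on `_id`.

```python
# Dicts keyed by _id stand in for the referenced collections.
collections = {
    "battles": {4354: {"_id": 4354, "name": "The battle of foo"}},
    "items": {10: {"_id": 10, "name": "sword"}, 11: {"_id": 11, "name": "shield"}},
    "locations": {7: {"_id": 7, "name": "north gate"}},
    "units": {},
}

# Option 2: the user document holds only ids.
user = {"_id": 3443, "battles": [4354], "items": [10, 11], "locations": [7], "units": []}

def full_user(user_doc):
    """Resolve every id list into full documents, one batch per collection."""
    resolved = {"_id": user_doc["_id"]}
    for name, docs in collections.items():
        resolved[name] = [docs[ref] for ref in user_doc[name]]
    return resolved

# The payload the server could hand to the client at login.
login_payload = full_user(user)
```

The number of lookups grows with the number of referenced collections, not with the number of referenced documents, which keeps the login cost predictable.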
