Patterns for Schema Changes in Document Databases


Problem Description

Before I start, I'd like to apologize for the rather generic nature of my questions. I am sure a whole book could be written on this particular topic.

Let's assume you have a big document database with multiple document schemas and millions of documents for each of these schemas. During the lifetime of the application, the need frequently arises to change the schema (and content) of the already stored documents.

Such changes could be:

  • adding new fields
  • recalculating field values (e.g. splitting Gross into Net and VAT)
  • dropping fields
  • moving fields into an embedded document

In my last project, where we used a SQL DB, we had some very similar challenges. They resulted in significant offline time (for a 24/7 product) when the changes became too drastic, as SQL DBs usually put a LOCK on a table while changes occur. I want to avoid such a scenario.

Another related question is how to handle schema changes from within the programming language environment in use. Usually schema changes happen by changing the class definition (I will be using Mongoid, an OR-mapper for MongoDB and Ruby). How do I handle old versions of documents that no longer conform to my latest class definition?

Solution

That is a very good question.

The good part of document-oriented databases such as MongoDB is that documents in the same collection don't need to have the same fields. Having different fields does not raise an error, per se. It's called flexibility. It is also the bad part, for the same reason.

So the problem, and also the solution, comes from the logic of your application.

Let's say we have a model Person and we want to add a field. Currently we have 5,000,000 people saved in the database. The problem is: how do we add that field with the least downtime?

Possible solution:

  1. Change the logic of the application so that it can cope with both a person with that field and a person without that field.

  2. Write a task (a script) that adds that field to each person in the database.

  3. Update the production deployment with the new logic.

  4. Run the script.

So the only downtime is the few seconds it takes to redeploy. Nonetheless, we need to spend time on the logic.

So basically we need to choose which is more valuable: the uptime or our time.
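
The four steps above can be sketched in plain Ruby. This is only an illustration under stated assumptions: hashes stand in for stored Person documents, and the field name `nickname`, the helpers `nickname_of` and `backfill_nickname`, and the batch size are all hypothetical. A real Mongoid application would issue the equivalent reads and updates against MongoDB.

```ruby
# Hypothetical default for the new field; not part of the original answer.
DEFAULT_NICKNAME = ""

# Step 1: application logic that copes with both document shapes.
# Documents written before the change simply lack the "nickname" key.
def nickname_of(person)
  person.fetch("nickname", DEFAULT_NICKNAME)
end

# Steps 2 and 4: a backfill task that walks the documents in small batches
# and adds the field only where it is missing, so it can safely be re-run.
# Returns the number of documents it changed.
def backfill_nickname(people, batch_size: 1000)
  updated = 0
  people.each_slice(batch_size) do |batch|
    batch.each do |person|
      next if person.key?("nickname") # already migrated, leave untouched
      person["nickname"] = DEFAULT_NICKNAME
      updated += 1
    end
  end
  updated
end
```

Because `nickname_of` tolerates both shapes, the new code can be deployed before the backfill has finished, which is what keeps the downtime to the redeploy itself.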

Now let's say we want to recalculate a field such as the VAT value. We cannot do the same as before, because having some products with VAT A and others with VAT B doesn't make sense.

So, a possible solution would be:

  1. Change the logic of the application so that it shows that the VAT values are being updated, and disable the operations that could use them, such as purchases.

  2. Write the script to update all the VAT values.

  3. Redeploy with the new code.

  4. Run the script. When it finishes:

  5. Redeploy with the full operation code.

So there is no absolute downtime, just a partial shutdown of some specific parts. Users can keep viewing product descriptions and using the other parts of the application.
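
A minimal sketch of this partial shutdown, again with plain hashes instead of real documents. Everything named here is an assumption made for illustration: the `VatMaintenance` flag, the `PurchasesDisabled` error, the `gross_cents`, `net_cents` and `vat_cents` fields, and the choice to store amounts as integer cents.

```ruby
# Hypothetical maintenance flag. A real application would more likely keep
# this in configuration or in the database than in process memory.
module VatMaintenance
  @active = false
  class << self
    attr_accessor :active
  end
end

class PurchasesDisabled < StandardError; end

# Step 1: operations that depend on VAT refuse to run during the update...
def purchase_total(product)
  raise PurchasesDisabled, "VAT values are being updated" if VatMaintenance.active
  product["net_cents"] + product["vat_cents"]
end

# ...while read-only operations stay available.
def describe(product)
  product["description"]
end

# Step 2: the script that recalculates every product with the same rate,
# splitting the gross amount into net and VAT.
def recalculate_vat(products, rate)
  products.each do |product|
    gross = product["gross_cents"]
    net = (gross / (1 + rate)).round # Integer / Rational, rounded to whole cents
    product["net_cents"] = net
    product["vat_cents"] = gross - net
  end
end
```

With the flag switched on, `purchase_total` raises while `describe` keeps working. Once `recalculate_vat(products, Rational(19, 100))` has finished and the flag is switched off again, every product carries values computed with the same rate.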

Now let's say that we want to drop a field. The process would be pretty much the same as the first one.
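
As a rough sketch of that process, assuming the application has already been redeployed so that it no longer reads the field: the helper `drop_field` and the field name `fax` are hypothetical, and hashes again stand in for stored documents.

```ruby
# Cleanup task: delete the obsolete field wherever it still exists.
# Documents that never had the field are skipped, so the task can be re-run.
# Returns the number of documents it changed.
def drop_field(documents, field)
  dropped = 0
  documents.each do |doc|
    next unless doc.key?(field)
    doc.delete(field)
    dropped += 1
  end
  dropped
end

people = [{ "name" => "Ada", "fax" => "555-0100" }, { "name" => "Grace" }]
drop_field(people, "fax")
```

In MongoDB itself the removal would typically be an update using the `$unset` operator rather than an in-memory loop.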

Now, moving fields into embedded documents: that's a good one! The process would be similar to the first one, but instead of checking for the existence of the field, we need to check whether it is an embedded document or a field.

The conclusion is that with a document-oriented database you have a lot of flexibility, and so you have elegant options at hand. Whether you use them depends on whether you value your development time or your client's time more.
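
One way to read that check, sketched with plain hashes: the reader first looks for the embedded document and falls back to the old top-level field. The field names `street` and `address` and the helpers `street_of` and `embed_street` are assumptions chosen for the example.

```ruby
# Reader that accepts both shapes: the new embedded "address" document
# and the old top-level "street" field.
def street_of(person)
  address = person["address"]
  return address["street"] if address.is_a?(Hash) && address.key?("street")
  person["street"]
end

# Migration task: move the top-level field into the embedded document.
# Already-migrated documents are skipped, so the task can be re-run.
# Returns the number of documents it changed.
def embed_street(people)
  moved = 0
  people.each do |person|
    next unless person.key?("street")
    old_value = person.delete("street")
    address = (person["address"] ||= {})
    # Keep a value that was already written in the new shape.
    address["street"] = old_value unless address.key?("street")
    moved += 1
  end
  moved
end
```

As with adding a field, the tolerant reader is deployed first, and the migration then runs in the background while the application stays online.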
