mysql数据版本控制系统的最佳实践 [英] Best practice for a mysql data versioning system

查看:61
本文介绍了mysql数据版本控制系统的最佳实践的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我必须将像文章这样的数据存储到mysql数据库中,并且如果对文章进行了修改,我还必须保存旧版本,以便可以还原它.我已经找到了与此主题类似的问题和帖子,但是我不确定哪种解决方案是解决该问题的最佳方法.

I have to store data like articles into a mysql database and if an article is modified, I have to save the old version as well to make it possible to restore it. I have found some similar questions and posts on this topic, but I am not sure, which solution is the best to solve the problem.

以下是基本表文章",以便更好地理解:

Here is the basic table "articles" for better understanding:

文章(ID,名称,文本)

articles (id, name, text)

对我来说,有两种不同的处理方法:

For me, there are two different approaches for this:

将数据和文章的每个版本存储在文章"表中,并添加版本"和状态"列.在版本中,我存储文章的递增版本号.活动文章的状态为1,其他的状态为2.

Store the data and every version of an article in the table "articles" and add the columns "version" and "status". In version i store the incremented version number of the article. The active article gets the "status" 1 and the others the "status" 2.

专业人士:

  • 只需要一张桌子

  • Only one table is needed

新版本是新数据的插入,而只是旧数据的状态"列的更新

A new version is an insert of the new data and only an update of the "status"-column of the old one

骗局

  • 非常大的表(也许查询速度较慢?)

将字段版本"添加到文章",并且仅将活动数据存储到表文章"中.数据的旧版本存储/移动到新表"articles_versioned".

Add the field "version" to "articles" and store only the active data into the table "articles". Old versions of the data is stored / moved to the new table "articles_versioned".

专业人士:

  • 只有实际有效数据在表"articles"中

骗局

  • 表的复制

所以.我忘了一个好方法吗?如何处理其他表中的相关数据(如图像等)?

So. Have I forgotten a good aproach? How to deal with related data in other tables (like images, etc.)?

推荐答案

我的选择是方法2的一种变体.粗体表示主键中的字段.

My choice would be a variation of approach 2. Bold indicates fields in the primary key.

  • 您将每篇文章插入表articles_versioned( id 时间戳,名称,文本)
  • 您的第二个表是articles( id ,时间戳,[名称,文本]).注意时间戳不是主要的.名称和文本可以复制,也可以与articles_versioned一起使用(这很快,因为id和timestamp是articles_versioned主键)
  • articles_versioned在插入时有一个触发器,该触发器采用刚插入的行并将其复制到articles
  • 要还原文章的特定版本,请修改articles表.
  • You insert every article in a table articles_versioned (id, timestamp, name, text)
  • Your second table is articles (id, timestamp, [name, text]). Note how timestamp is not primary; name and text may be replicated, or you may use a join with articles_versioned (which will be fast since id and timestamp are the articles_versioned primary key)
  • articles_versioned has a trigger on insert that takes the just inserted row and replicates it on articles
  • To restore a specific version of an article you modify the articles table.

这种方法的优点是:

  1. 您在表格中免费获得了可能仍需要的其他信息(文章的日期和时间)
  2. 您无需查询数据库即可获取当前日期.如果使用版本,则必须这样做.
  3. 您的代码不必将文章插入两个表中.您只需插入articles_versioned并从articles中读取数据,db便会在您通过触发器插入数据时负责迁移数据,从而避免了任何一致性问题.
  1. You get for free another information (the date and time of the article) in your table, that you may need anyway
  2. You do not need to query the database to get the current date. If you use version, you have to.
  3. Your code doesn't have to insert the article in two tables. You simply insert in articles_versioned and read from articles, the db takes care of migrating data as you insert it via the trigger, avoiding any consistency problems.

骗子

  1. 在高度并发的环境中,可能会同时插入两个版本,因此其中一个可能会失败.插入用户撰写的文章时,这应该不是问题(鉴于目前的时间戳精确度,这不太可能出现).如果未在INSERT语句中指定时间戳,而是将datetime字段设置为将当前时间作为默认值,则可以完全避免此问题.
  1. In an heavily concurrent environment, two versions may be inserted at the very same time, so one of them may fail. This shouldn't be a problem when inserting user-written articles (it is highly unlikely given precision of timestamps these days). If you don't specify the timestamp in your INSERT statement, but instead you set the datetime field to have the current time as a default value, you may avoid this problem entirely.


回答其余的问题.只要您添加状态索引,方法1就不会导致更长的查询.仅当您倾向于每个文章有许多不同版本时,这才有意义.只要每篇文章平均有2个或更少的版本,索引只会使您变慢,并且方法2无论如何都不会明显加快(尽管我仍然推荐我的方法,因为它简单地编写了代码,因为还原版本确实可以不需要两行切换状态.


To answer the rest of your question. Approach 1 will not lead to longer queries as long as you add an index on status. This makes sense only if you tend to have many different versions of each article; as long as you have 2 versions per article on average or less, the index will only slow you down, and approach 2 would not be sensibly faster anyway (altough I'd still recommend my approach because it simplyfies code, since restoring a version does not require switching status for two rows).

相关资源(如图像)应遵循类似的版本控制.我假设您将它们保存在文件系统中;与其使用真实姓名保存它们,不如使用表( id ,image_name)为每个图像提供一个ID,然后将该图像另存为-id-.jpg. image_name字段将使您能够知道原始文件名是什么(如果您担心的话).这样,您就可以像对文章进行版本化一样对图像进行版本控制,并且在文章中您将使用诸如<img src="-id-.jpg">之类的东西,您知道它们将永远可用.

Related resources, like images, should follow a similar versioning. I assume you are saving them on the filesystem; instead of saving them with their real name, use a table (id, image_name) to give to each image an id, then save the image as -id-.jpg. The image_name field will make you able to know what the original file name was (if you care about that). This way you can version images the same way as you version articles, and in articles you would use something like <img src="-id-.jpg">, that you know will remain available forever.

这篇关于mysql数据版本控制系统的最佳实践的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆