Database - data versioning in single table

Problem description

I'm developing a CMS which has some version control features. It's based on a MySQL database.

The idea is to show public site visitors a "certain revision" of the data and backoffice users a preview of the "latest revision". Publishing something just means setting the "certain revision" equal to the latest one (and maybe deleting the data of old revisions).

I've read some Q&As about the topic on SO; most of them suggest that holding "old" and "new" rows in the same table is bad. But since I need to join tables, all of them "versioned", splitting old and new rows into different tables isn't ideal either (how should the app know whether "content" from one revision is old or new, and hence whether to look for it in a "_history" table?).

So I decided to use just one table for each "content type".

The design I used: every table holds a "revision INT NOT NULL" column (part of primary key, together with an ID column).

Modifying something means inserting a new row with the modified values, an incremented revision, but the same ID.

Inserting something means inserting a new row with incremented ID and incremented revision.

Deleting something means inserting an empty row with the same ID, an incremented revision and a "tombstone" flag set to "true".

Example: there are pages and there are "views" (not views in the MVC sense, but in an application-specific sense). Views are versioned. One page has many views. This is (part of) the "_views" table.
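
The three write operations above can be sketched as plain inserts. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the helper names (insert_view, modify_view, delete_view) are hypothetical, and the revision number is passed in by the caller because the post does not say how it is generated. The tombstone row keeps the page value so that a later filter on page still sees it, and uses 0 as a placeholder for "order".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE _views (
        _id     INTEGER NOT NULL,
        _rev    INTEGER NOT NULL,
        _ts     INTEGER NOT NULL DEFAULT 0,  -- tombstone flag (BIT(1) in MySQL)
        page    INTEGER NOT NULL,
        "order" INTEGER NOT NULL,
        PRIMARY KEY (_id, _rev)
    )
""")

INSERT_SQL = 'INSERT INTO _views (_id, _rev, _ts, page, "order") VALUES (?, ?, ?, ?, ?)'


def insert_view(conn, rev, page, order):
    """Create a new item: next free _id, revision supplied by the caller."""
    (next_id,) = conn.execute("SELECT COALESCE(MAX(_id), 0) + 1 FROM _views").fetchone()
    conn.execute(INSERT_SQL, (next_id, rev, 0, page, order))
    return next_id


def modify_view(conn, view_id, rev, page, order):
    """Modify an item: same _id, higher revision, new values."""
    conn.execute(INSERT_SQL, (view_id, rev, 0, page, order))


def delete_view(conn, view_id, rev, page):
    """Delete an item: same _id, higher revision, tombstone flag set."""
    conn.execute(INSERT_SQL, (view_id, rev, 1, page, 0))


view_id = insert_view(conn, rev=1, page=10, order=1)
modify_view(conn, view_id, rev=2, page=10, order=5)
delete_view(conn, view_id, rev=3, page=10)

# Every change is kept as its own row; nothing is updated in place.
rows = conn.execute('SELECT _id, _rev, _ts, "order" FROM _views ORDER BY _rev').fetchall()
```

Because no row is ever overwritten, the full history of an item is simply all rows sharing its _id.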

CREATE TABLE `_views` (
  `_id` int(11) NOT NULL,
  `_rev` int(11) NOT NULL,
  `_ts` BIT(1) DEFAULT b'0',
  `page` int(11) NOT NULL,
  `order` int(11) NOT NULL,
  PRIMARY KEY (`_id`,`_rev`)
)

I need to select all views that a page contains, up to a "certain revision", in the order specified by "order".

This query works:

SELECT * FROM (
 SELECT *
 FROM `_views`
 WHERE `page` = :page
 AND `_rev` <= :revision
 ORDER BY `_rev` DESC
) AS `all`
GROUP BY `_id`
HAVING `_ts` = 0
ORDER BY `order`

The subquery selects all views of a page that were once "published" (whose revision is less than or equal to the "published" revision). The outer query reduces each group to its latest revision, removes the groups that end in a tombstone, and orders the rest by application-specific criteria.

Since scalability and performance are crucial for a CMS, isn't there a better, more elegant way than subqueries?
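
One caveat about the query above: it relies on GROUP BY returning the first row of each pre-sorted group, which MySQL does not guarantee, and which is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5). A deterministic variant joins each row against the highest qualifying revision per _id. The following is a minimal sketch, again using sqlite3 as a stand-in for MySQL, with made-up sample rows and a hypothetical helper views_at; it still uses a derived table, so it addresses correctness rather than the performance question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE _views (
        _id INTEGER NOT NULL, _rev INTEGER NOT NULL,
        _ts INTEGER NOT NULL DEFAULT 0,
        page INTEGER NOT NULL, "order" INTEGER NOT NULL,
        PRIMARY KEY (_id, _rev)
    )
""")
conn.executemany("INSERT INTO _views VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 0, 10, 2),  # view 1 created in revision 1
    (1, 3, 0, 10, 1),  # view 1 modified in revision 3
    (2, 1, 0, 10, 1),  # view 2 created in revision 1
    (2, 2, 1, 10, 0),  # view 2 tombstoned in revision 2
    (3, 4, 0, 10, 3),  # view 3 created in revision 4
])

QUERY = """
SELECT v._id, v._rev, v."order"
FROM _views AS v
JOIN (
    -- highest revision of each view that is not newer than :revision
    SELECT _id, MAX(_rev) AS _rev
    FROM _views
    WHERE page = :page AND _rev <= :revision
    GROUP BY _id
) AS latest ON latest._id = v._id AND latest._rev = v._rev
WHERE v._ts = 0          -- drop views whose latest row is a tombstone
ORDER BY v."order"
"""


def views_at(conn, page, revision):
    """Return (_id, _rev, order) of the live views of a page at a revision."""
    return conn.execute(QUERY, {"page": page, "revision": revision}).fetchall()


at_rev_1 = views_at(conn, 10, 1)
at_rev_2 = views_at(conn, 10, 2)
at_rev_4 = views_at(conn, 10, 4)
```

In MySQL the identifier quoting would use backticks, as in the original query.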

... or should I just focus on caching?

Solution

Using subqueries to determine the current revision is not the best approach; you really don't want to go there.

A simpler method is to add a flag which tells you about the most current revision:

   `_rev` int(11) NOT NULL,
   `_current` BIT(1),

This requires a manual UPDATE to set the _current flag whenever a new revision is added or the _ts flag changes. But at least that avoids executing the subquery on each page display.

As an alternative, you could still split your data into a _current and a _history table. You'd then just create a view over both for those cases where you need to join result sets again:
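
A minimal sketch of that bookkeeping, assuming sqlite3 as a stand-in for MySQL and hypothetical helpers add_revision and current_views: the old row is unflagged and the new row is inserted with the flag set, both inside one transaction so that an item never ends up with zero or two current rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE _views (
        _id INTEGER NOT NULL, _rev INTEGER NOT NULL,
        _ts INTEGER NOT NULL DEFAULT 0,
        _current INTEGER NOT NULL DEFAULT 0,  -- 1 only on the latest row per _id
        page INTEGER NOT NULL, "order" INTEGER NOT NULL,
        PRIMARY KEY (_id, _rev)
    )
""")


def add_revision(conn, view_id, rev, page, order, tombstone=False):
    """Insert a new revision and move the _current flag to it atomically."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "UPDATE _views SET _current = 0 WHERE _id = ? AND _current = 1",
            (view_id,),
        )
        conn.execute(
            'INSERT INTO _views (_id, _rev, _ts, _current, page, "order") '
            "VALUES (?, ?, ?, 1, ?, ?)",
            (view_id, rev, int(tombstone), page, order),
        )


def current_views(conn, page):
    """Latest live views of a page, with no grouping or subquery."""
    return conn.execute(
        'SELECT _id, _rev, "order" FROM _views '
        'WHERE page = ? AND _current = 1 AND _ts = 0 ORDER BY "order"',
        (page,),
    ).fetchall()


add_revision(conn, 1, 1, page=10, order=2)
add_revision(conn, 2, 1, page=10, order=1)
add_revision(conn, 1, 2, page=10, order=3)                  # view 1 modified
add_revision(conn, 2, 3, page=10, order=0, tombstone=True)  # view 2 deleted

latest = current_views(conn, 10)
flag_counts = conn.execute(
    "SELECT _id, SUM(_current) FROM _views GROUP BY _id ORDER BY _id"
).fetchall()
```

Note that this flag tracks the latest revision only; showing visitors an older published revision would presumably need a second flag maintained the same way, which is an assumption beyond what the answer states.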

 CREATE VIEW pages_all AS
      SELECT * FROM pages_current
      UNION ALL SELECT * FROM pages_history
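
How the two tables and the view might fit together, as a rough sketch: sqlite3 stands in for MySQL, the title column and the save_page helper are hypothetical, and the approach (copy the current row into the history table before replacing it) is one possible reading of the split, not something the answer spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages_current (
        _id INTEGER NOT NULL, _rev INTEGER NOT NULL, title TEXT,
        PRIMARY KEY (_id)            -- exactly one current row per page
    );
    CREATE TABLE pages_history (
        _id INTEGER NOT NULL, _rev INTEGER NOT NULL, title TEXT,
        PRIMARY KEY (_id, _rev)      -- all superseded revisions
    );
    CREATE VIEW pages_all AS
        SELECT * FROM pages_current
        UNION ALL SELECT * FROM pages_history;
""")


def save_page(conn, page_id, rev, title):
    """Archive the current row (if any), then store the new revision."""
    with conn:  # one transaction: archive, remove and insert succeed together
        conn.execute(
            "INSERT INTO pages_history SELECT * FROM pages_current WHERE _id = ?",
            (page_id,),
        )
        conn.execute("DELETE FROM pages_current WHERE _id = ?", (page_id,))
        conn.execute("INSERT INTO pages_current VALUES (?, ?, ?)", (page_id, rev, title))


save_page(conn, 1, 1, "Home")
save_page(conn, 1, 2, "Home v2")

current = conn.execute("SELECT _rev, title FROM pages_current").fetchall()
everything = conn.execute(
    "SELECT _rev, title FROM pages_all WHERE _id = 1 ORDER BY _rev"
).fetchall()
```

Reads for page display touch only pages_current, while code that needs every revision queries pages_all.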

Likewise, it might be possible to create a subtable of all active (non-tombstoned) revisions if you need to group them frequently, although that would incur even more manual micromanagement than a _current flag or just a view over the _history table.
