时态数据库设计,稍有不同(实时行与草稿行) [英] Temporal database design, with a twist (live vs draft rows)

查看:18
本文介绍了时态数据库设计,稍有不同(实时行与草稿行)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在考虑实现对象版本控制,并增加需要同时拥有实时对象和草稿对象的方法,并且可以使用来自某人在这方面的经验的见解,因为我开始怀疑这是否可能没有潜在的可能性可怕的黑客.

I'm looking into implementing object-versioning with the added twist of needing to have both live and draft objects, and could use the insights from someone experience in this, as I'm beginning to wonder if it's even possible without potentially horrific hacks.

为了示例,我会将其分解为带有标签的帖子,但我的用例更通用(涉及缓慢变化的维度 - http://en.wikipedia.org/wiki/Slowly_changed_dimension).

I'll break it down to posts with tags for the sake of the example, but my use-case is a bit more general (involving slowly changing dimensions - http://en.wikipedia.org/wiki/Slowly_changing_dimension).

假设您有一个帖子表、一个标签表和一个 post2tag 表:

Suppose you've a posts table, a tags table, and a post2tag table:

posts (
 id
)

tags (
 id
)

post2tag (
 post_id fkey posts(id),
 tag_id fkey tags(id)
)

我需要一些东西:

  1. 能够准确显示帖子在任意日期时间的样子,包括删除的行.
  2. 跟踪谁在编辑什么,以获得完整的审计跟踪.
  3. 需要一组物化视图(实时"表)以保持参照完整性(即日志记录应该对开发人员透明).
  4. 需要对实时最新的草稿行保持适当的速度.
  5. 能够将草稿帖子与实时帖子共存.
  1. Being able to show exactly how a post looked like at an arbitrary datetime, including for deleted rows.
  2. Keep track of who is editing what, for a complete audit trail.
  3. Needs a set of materialized views ("live" tables) for the sake of keeping referential integrity (i.e. logging should be transparent to the developers).
  4. Needs to be appropriately fast for live and the latest draft rows.
  5. Being able to have a draft post coexist with a live post.

我一直在研究各种选项.到目前为止,我想出的最好的(没有点 #4/#5)看起来有点像 SCD type6-hybrid 设置,但不是有当前的布尔值,而是当前行的物化视图.出于所有意图和目的,它看起来像这样:

I've been investigating various options. So far, the best I've come up with (without points #4/#5) looks a bit like the SCD type6-hybrid setup, but instead of having a current boolean there's a materialized view for the current row. For all intents and purposes, it looks like this:

posts (
 id pkey,
 public,
 created_at,
 updated_at,
 updated_by
)

post_revs (
 id,
 rev pkey,
 public,
 created_at,
 created_by,
 deleted_at
)

tags (
 id pkey,
 public,
 created_at,
 updated_at,
 updated_by
)


tag_revs (
 id,
 public,
 rev pkey,
 created_at,
 created_by,
 deleted_at
)

post2tag (
 post_id fkey posts(id),
 tag_id fkey tags(id),
 public,
 created_at,
 updated_at,
 updated_by
)

post2tag_revs (
 post_id,
 tag_id,
 post_rev fkey post_revs(rev), -- the rev when the relation started
 tag_rev fkey tag_revs(rev), -- the rev when the relation started
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (post_rev, tag_rev)
)

我正在使用 pg_temporal 来维护周期(created_at、deleted_at)上的索引.我使用触发器使各种表保持同步.Yada yada yada... 我创建了触发器,允许取消对帖子/标签的编辑,这样草稿就可以存储到 rev 中而不被发布.效果很好.

I'm using pg_temporal to maintain indexes on period(created_at, deleted_at). And I keep the various tables in sync using triggers. Yada yada yada... I created the triggers that allow to cancel an edit to posts/tags in such a way that the draft gets stored into the revs without being published. It works great.

除非我需要担心 post2tag 上的草稿行相关关系.在那种情况下,一切都变得混乱了,这向我暗示我在那里遇到了某种设计问题.但我的想法不多了...

Except when I need to worry about draft-row related relations on post2tag. In that case, all hell breaks loose, and this hints to me that I've some kind of design problem in there. But I'm running out of ideas...

我考虑过引入数据重复(即为每个修订草案引入 n 个 post2tag 行).这种工作,但往往比我希望的要慢得多.

I've considered introducing data duplication (i.e. n post2tag rows introduced for each draft revision). This kind of works, but tends to be a lot slower than I'd like it to be.

我曾考虑为最后一次草稿"引入草稿表,但这很快就会变得非常难看.

I've considered introducing drafts tables for the "last draft", but this quickly tends to become very very ugly.

我考虑过各种标志...

I've considered all sorts of flags...

那么问题:在行版本控制的环境中,是否有一种普遍接受的方法来管理实时行与非实时行?如果没有,您尝试过并取得相当成功的方法是什么?

So question: is there a generally accepted means of managing live vs non-live rows in a row-version controlled environment? And if not, what have you tried and been reasonably successful with?

推荐答案

我想我已经搞定了.基本上,您将一个(唯一的)草稿字段添加到相关表中,然后您就可以像处理新帖子/标签/等一样处理草稿:

I think I nailed it. Basically, you add a (unique) draft field to the relevant tables, and you work on the drafts as if they were a new post/tag/etc.:

posts (
 id pkey,
 public,
 created_at stamptz,
 updated_at stamptz,
 updated_by int,
 draft int fkey posts (id) unique
)

post_revs (
 id,
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (id, created_at)
)

tags (
 id pkey,
 public,
 created_at,
 updated_at,
 updated_by,
 draft fkey tags (id) unique
)


tag_revs (
 id,
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (id, created_at)
)

post2tag (
 post_id fkey posts(id),
 tag_id fkey tags(id),
 public,
 created_at,
 updated_at,
 updated_by,
 pkey (post_id, tag_id)
)

post2tag_revs (
 post_id,
 tag_id,
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (post_id, tag_id, created_at)
)

这篇关于时态数据库设计,稍有不同(实时行与草稿行)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆