Questions about Postgres track_commit_timestamp (pg_xact_commit_timestamp)


Question

I'm working on a design for a concurrency-safe incremental aggregate rollup system, and track_commit_timestamp (pg_xact_commit_timestamp) sounds perfect. But I've found very little commentary on it, and I couldn't work out from the source code how it behaves in detail.

Hopefully, someone knows the answers to one or more of my questions:

  • Is it possible for the commit timestamp feature to produce times out of order? What I'm after is a way to identify records that have changed since a specific time, so that I can pick up any later changes for processing. If there are identical timestamps, I don't need them in perfect commit sequence.

  • How many bytes are added to each row in the final implementation? The discussions I saw seemed to range from 12 to 24 bytes. There was talk of adding extra bytes "just in case." That was pre-9.5, so a world ago.

  • Are the timestamps indexed internally? With a B-tree? I ask for capacity-planning reasons.

  • I've seen on StackOverflow and in the design discussions that the timestamps are not kept indefinitely, but I can't find the details on exactly how long they are stored.

  • Any rules of thumb on the performance impact of enabling track_commit_timestamp? I don't need the data on all tables but, where I do, it sounds like it might work perfectly.

  • Any gotchas? I tried running VACUUM FULL on a test table and none of the pg_xact_commit_timestamp values changed. It seems like a physical operation such as VACUUM shouldn't change anything, but there could easily be something I've not thought of. And, honestly, my quick VACUUM test might not even mean anything.

Thanks very much for any help!

I've edited my question to clarify what I'm trying to accomplish: I'm looking to track processed and unprocessed data based on update stamps.

select max(pg_xact_commit_timestamp(xmin)) from scan;  -- 2019-07-07 20:46:14.694288+10

update scan set quantity = 5 where quantity = 1;       -- Change some data.

select max(pg_xact_commit_timestamp(xmin)) from scan;  -- 2019-07-10 09:38:17.920294+10

-- Find the changed row(s):
select * 
  from scan 
 where pg_xact_commit_timestamp(xmin) > '2019-07-07 20:46:14.694288+10'; 

The idea is to roll up rows incrementally and on a regular schedule:

-- Track the last rolled up timestamp.
-- Wait for 5 minutes (or whatever).
-- Find the current max commit timestamp.
-- Search for rows where the commit timestamp is between the last processed timestamp and the max.
-- Roll them up.
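The steps above could be sketched roughly as follows. This is only an illustration of the plan as described, not a recommended design: the rollup_state table and its last_processed column are hypothetical names introduced here, and scan is the test table from the queries above.

```sql
-- Hypothetical bookkeeping table (not from the original post):
-- a single row holding the last commit timestamp already rolled up.
CREATE TABLE IF NOT EXISTS rollup_state (
    id             integer     PRIMARY KEY DEFAULT 1 CHECK (id = 1),
    last_processed timestamptz NOT NULL DEFAULT '-infinity'
);
INSERT INTO rollup_state DEFAULT VALUES ON CONFLICT (id) DO NOTHING;

-- One rollup pass: find rows whose inserting/updating transaction was
-- stamped after the last processed timestamp, up to the current maximum.
WITH bounds AS (
    SELECT s.last_processed AS lo,
           (SELECT max(pg_xact_commit_timestamp(x.xmin)) FROM scan x) AS hi
      FROM rollup_state s
)
SELECT sc.*
  FROM scan sc
  CROSS JOIN bounds b
 WHERE pg_xact_commit_timestamp(sc.xmin) >  b.lo
   AND pg_xact_commit_timestamp(sc.xmin) <= b.hi;

-- After the rows are rolled up, the upper bound (hi) would be written
-- back to rollup_state.last_processed for the next pass.
```

Note that pg_xact_commit_timestamp(xmin) is evaluated per row here, so every pass scans the whole table.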

Transaction IDs alone can't work because they can commit out of order very easily. This timestamp system doesn't have to be 100% perfect, but I'm aiming for something very close to perfect. So a bit of clock wiggle, and even a bit of confusion around overlapping start/end times, is likely tolerable.

Is there a glaring flaw in this plan?

Recommended answer

As this subject doesn't seem to show up in the archives very much, I want to add a bit of detail before moving on. I asked related questions on several lists and forums, and by direct communication. Several people were kind enough to review the source code, provide historical background, and clear this up for me. Hopefully, leaving some detail here will help someone else down the track. Errors are all mine, obviously; corrections and enhancements are more than welcome.

  • Commit timestamps are assigned when the transaction's work is completed, but that's not the same as when it is committed. The WAL writer doesn't update the stamps to keep them in chronological sequence.

Therefore, commit timestamps are definitely not a reliable mechanism for finding changed rows in order.

Multiple clocks. Self-adjusting clocks. Oh, the humanity!

If you do want an in-order change sequence, logical decoding or replication are options. (I tried out logical replication a couple of weeks ago experimentally. Coolest. Thing. Ever.)
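For anyone who wants to see what logical decoding output looks like, here is a minimal sketch using the built-in SQL functions. It assumes wal_level is set to logical, a role allowed to manage replication slots, and the test_decoding sample plugin from contrib; the slot name rollup_slot is made up for illustration, and scan is the test table from the question.

```sql
-- Create a logical slot; changes are retained from this point on.
SELECT * FROM pg_create_logical_replication_slot('rollup_slot', 'test_decoding');

-- Make a change so there is something to decode.
UPDATE scan SET quantity = 5 WHERE quantity = 1;

-- Peek at the decoded changes without consuming them.
SELECT * FROM pg_logical_slot_peek_changes('rollup_slot', NULL, NULL);

-- Read and consume the changes; they are delivered in commit order.
SELECT * FROM pg_logical_slot_get_changes('rollup_slot', NULL, NULL);

-- Drop the slot when finished: an abandoned slot keeps WAL from being recycled.
SELECT pg_drop_replication_slot('rollup_slot');
```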

The cost of timestamp tracking is 12 bytes per transaction, not per row. So, not so bad. (Timestamps are 8 bytes, transaction IDs are 4 bytes.)
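To put the per-transaction figure in perspective, a quick estimate can be compared with what is actually on disk. This is a sketch only: the 100 million transaction count is an arbitrary example, the 12-byte figure is the one quoted above rather than something verified here, and the second query assumes a role permitted to call pg_ls_dir and pg_stat_file (superuser by default) plus the standard pg_commit_ts directory under the data directory.

```sql
-- Back-of-the-envelope: quoted bytes per transaction times an
-- example volume of 100 million transactions.
SELECT pg_size_pretty(12::bigint * 100000000) AS estimated_commit_ts_size;

-- Measure what the commit timestamp store currently occupies on disk.
SELECT count(*) AS segment_files,
       pg_size_pretty(sum((pg_stat_file('pg_commit_ts/' || f)).size)) AS on_disk
  FROM pg_ls_dir('pg_commit_ts') AS f;
```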

This is all part of the existing transaction system, so the realities of transaction ID wraparound apply here too. (Not scary in my case.) See:

https://www.postgresql.org/docs/current/routine-vacuuming.html

For the record, you can enable this option on RDS via a parameter group setting. Just set track_commit_timestamp to 1 and restart. (In postgresql.conf, the setting is 'on'.)
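On a self-managed server, the equivalent can be done from SQL instead of editing the configuration file by hand. A minimal sketch, assuming superuser access (ALTER SYSTEM is generally not available on RDS, where the parameter group is the way to go):

```sql
-- Persist the setting in postgresql.auto.conf; it only takes effect
-- after the server is restarted.
ALTER SYSTEM SET track_commit_timestamp = on;

-- After the restart, confirm the setting is active.
SHOW track_commit_timestamp;

-- Transaction ID and commit timestamp of the most recently committed transaction.
SELECT * FROM pg_last_committed_xact();
```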
