How do I find the last time that a PostgreSQL database has been updated?

Question

I am working with a PostgreSQL database that gets updated in batches. I need to know the last time the database (or a table in the database) was updated or modified; either will do.

I saw that someone on the PostgreSQL forum had suggested using logging and querying the logs for the time. This will not work for me because I do not have control over the client's codebase.

Recommended answer

You can write a trigger to run every time an insert/update is made on a particular table. The common usage is to set a "created" or "last_updated" column of the row to the current time, but you could also update the time in a central location if you don't want to change the existing tables.

So for example a typical way is the following one:

-- Trigger function: stamp the row being written with the current transaction time.
CREATE FUNCTION stamp_updated() RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
  NEW.last_updated := now();
  RETURN NEW;
END;
$$;
-- repeat for each table you need to track:
ALTER TABLE sometable ADD COLUMN last_updated TIMESTAMP;
CREATE TRIGGER sometable_stamp_updated
  BEFORE INSERT OR UPDATE ON sometable
  FOR EACH ROW EXECUTE PROCEDURE stamp_updated();

Then to find the last update time, you need to select "MAX(last_updated)" from each table you are tracking and take the greatest of those, e.g.:

SELECT MAX(max_last_updated) FROM (
  SELECT MAX(last_updated) AS max_last_updated FROM sometable
  UNION ALL
  SELECT MAX(last_updated) FROM someothertable
) updates

For tables with a serial (or similarly generated) primary key, you can try to avoid a sequential scan when finding the latest update time by using the primary key index, or you can create indexes on last_updated.

-- get timestamp of row with highest id
SELECT last_updated FROM sometable ORDER BY sometable_id DESC LIMIT 1

Note that this can give slightly wrong results if the IDs are not quite sequential, but how much accuracy do you need? (Bear in mind that transactions mean rows can become visible to you in a different order from the one in which they were created.)
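
For the other option mentioned above, an index on last_updated, a minimal sketch might look like the following. The index name is an assumption made for illustration; the table and column names follow the earlier example. PostgreSQL can normally answer MAX() on an indexed column from the index itself rather than scanning the whole table.

```sql
-- Hypothetical index name; repeat for each tracked table.
CREATE INDEX sometable_last_updated_idx ON sometable (last_updated);

-- With the index in place, the planner can read the largest non-null
-- value from the end of the index instead of scanning every row.
SELECT MAX(last_updated) FROM sometable;
```

Unlike the primary key shortcut, this does not depend on IDs being assigned in update order, at the cost of maintaining one extra index per table.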

An alternative approach to avoid adding 'updated' columns to each table is to have a central table to store update timestamps in. For example:

-- One row is appended per modifying statement, so table_name cannot be a primary key.
CREATE TABLE update_log(table_name text NOT NULL, updated timestamp NOT NULL DEFAULT now());
CREATE FUNCTION stamp_update_log() RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
  -- "updated" is filled in by its DEFAULT now()
  INSERT INTO update_log(table_name) VALUES(TG_TABLE_NAME);
  -- the return value of an AFTER trigger is ignored
  RETURN NULL;
END;
$$;
-- Repeat for each table you need to track:
CREATE TRIGGER sometable_stamp_update_log
  AFTER INSERT OR UPDATE ON sometable
  FOR EACH STATEMENT EXECUTE PROCEDURE stamp_update_log();

This will give you a table with a row for each table update: you can then just do:

SELECT MAX(updated) FROM update_log

This gets the last update time. (You could split this out by table if you wanted.) This table will of course just keep growing: either create an index on 'updated' (which should make getting the latest one pretty fast) or truncate it periodically if that fits your use case (e.g. take an exclusive lock on the table, get the latest update time, then truncate it, if you need to periodically check whether changes have been made).
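
The options in the previous paragraph could be sketched roughly as follows. This is an illustration rather than a tested maintenance routine: the index name is an assumption, and the lock-then-truncate sequence is just one way to read "get the latest update time, then truncate".

```sql
-- Split the result out by table: latest update per tracked table.
SELECT table_name, MAX(updated) AS last_updated
FROM update_log
GROUP BY table_name;

-- Hypothetical index name; keeps MAX(updated) fast as the log grows.
CREATE INDEX update_log_updated_idx ON update_log (updated);

-- Periodic check: read the latest timestamp, then empty the log.
-- The lock stops the triggers from inserting rows between the two steps.
BEGIN;
LOCK TABLE update_log IN ACCESS EXCLUSIVE MODE;
SELECT MAX(updated) FROM update_log;
TRUNCATE update_log;
COMMIT;
```

While the lock is held, any statement whose trigger writes to update_log will wait, so the transaction should be kept short.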

An alternative approach, which might be what the folks on the forum meant, is to set 'log_statement = mod' in the database configuration (either globally for the cluster, or on the database or user you need to track), and then all statements that modify the database will be written to the server log. You'll then need to write something outside the database to scan the server log, filtering out tables you aren't interested in, etc.
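
As a rough sketch of the three scopes mentioned above, the setting can be applied from SQL instead of editing postgresql.conf by hand. The names somedb and someuser are placeholders, not names from the question; changing log_statement normally requires superuser rights, and ALTER SYSTEM is available from PostgreSQL 9.4 onwards.

```sql
-- Cluster-wide: written to postgresql.auto.conf, then reloaded.
ALTER SYSTEM SET log_statement = 'mod';
SELECT pg_reload_conf();

-- Only for one database (placeholder name); applies to new sessions.
ALTER DATABASE somedb SET log_statement = 'mod';

-- Only for one role (placeholder name); applies to new sessions.
ALTER ROLE someuser SET log_statement = 'mod';
```

For the logged statements to be useful here, each log line also needs a timestamp, which depends on log_line_prefix containing an escape such as %m or %t.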
