PostgreSQL临时表 [英] PostgreSQL temporary tables

查看:266
本文介绍了PostgreSQL临时表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要执行250万次查询.此查询生成我需要AVG(column)的一些行,然后使用此AVG从低于平均值的所有值中过滤表.然后,我需要将这些过滤后的结果INSERT放入表中.

I need to perform a query 2.5 million times. This query generates some rows which I need to AVG(column) and then use this AVG to filter the table from all values below average. I then need to INSERT these filtered results into a table.

以合理的效率执行此操作的唯一方法似乎是通过为每个query-postmaster python-thread创建一个TEMPORARY TABLE.我只是希望这些TEMPORARY TABLE不会被永久保存到硬盘驱动器中,并且会保留在内存(RAM)中,除非它们不在工作内存中.

The only way to do such a thing with reasonable efficiency, seems to be by creating a TEMPORARY TABLE for each query-postmaster python-thread. I am just hoping these TEMPORARY TABLEs will not be persisted to hard drive (at all) and will remain in memory (RAM), unless they are out of working memory, of course.

我想知道TEMPORARY TABLE是否会引起磁盘写操作(这会干扰INSERTS,即使整个过程变慢)

I would like to know if a TEMPORARY TABLE will incur disk writes (which would interfere with the INSERTS, i.e. slow to whole process down)

推荐答案

请注意,在Postgres中,临时表的默认行为是不会自动删除它们,并且在提交时会保留数据.参见 ON COMMIT .

Please note that, in Postgres, the default behaviour for temporary tables is that they are not automatically dropped, and data is persisted on commit. See ON COMMIT.

临时表被删除,数据库会话:

临时表在会话结束时自动删除,或者 (可选)在当前交易结束时.

Temporary tables are automatically dropped at the end of a session, or optionally at the end of the current transaction.

您必须考虑多个因素:

  • If you do want to explicitly DROP a temporary table at the end of a transaction, create it with the CREATE TEMPORARY TABLE ... ON COMMIT DROP syntax.
  • In the presence of connection pooling, a database session may span multiple client sessions; to avoid clashes in CREATE, you should drop your temporary tables -- either prior to returning a connection to the pool (e.g. by doing everything inside a transaction and using the ON COMMIT DROP creation syntax), or on an as-needed basis (by preceding any CREATE TEMPORARY TABLE statement with a corresponding DROP TABLE IF EXISTS, which has the advantage of also working outside transactions e.g. if the connection is used in auto-commit mode.)
  • While the temporary table is in use, how much of it will fit in memory before overflowing on to disk? See the temp_buffers option in postgresql.conf
  • Anything else I should worry about when working often with temp tables? A vacuum is recommended after you have DROPped temporary tables, to clean up any dead tuples from the catalog. Postgres will automatically vacuum every 3 minutes or so for you when using the default settings (auto_vacuum).

此外,与您的问题无关(但可能与您的项目有关):请记住,如果必须在填充后的临时表 上运行查询,则它是一个创建适当的索引并在要插入的临时表 之后在临时表上发出ANALYZE的好主意.默认情况下,基于成本的优化器将假定新创建的临时表具有约1000行,如果临时表实际包含数百万行,这可能会导致性能下降.

Also, unrelated to your question (but possibly related to your project): keep in mind that, if you have to run queries against a temp table after you have populated it, then it is a good idea to create appropriate indices and issue an ANALYZE on the temp table in question after you're done inserting into it. By default, the cost based optimizer will assume that a newly created the temp table has ~1000 rows and this may result in poor performance should the temp table actually contain millions of rows.

这篇关于PostgreSQL临时表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆