30 tables with few rows - TRUNCATE the fastest way to empty them and reset attached sequences?

Question

I wonder what the fastest way is to accomplish this kind of task in PostgreSQL. I am interested in the fastest solution possible.

I found such a solution for MySQL myself; it performs much faster than truncating the tables one by one. I am still interested in the fastest solution for MySQL too. See my results here (they are for MySQL only): https://github.com/bmabey/database_cleaner/issues/126

I have the following assumptions:

  • I have 30-100 tables. Let them be 30.
  • Half of the tables are empty.
  • Each non-empty table has, say, no more than 100 rows. In other words, the tables are NOT large.
  • I need an option to exclude 2, 5, or N tables from this procedure.
  • I cannot use transactions.

I need the fastest cleaning strategy for this case, working on both PostgreSQL 8 and 9.

I see the following approaches:

  1. Truncate each table. I think this is too slow, especially for empty tables.

  2. Check each table for emptiness with some faster method. If the table is empty, reset its unique identifier column (the analog of AUTO_INCREMENT in MySQL) to its initial state, i.e. restore last_value of its sequence back to 1; otherwise run TRUNCATE on it.

I use Ruby code to iterate through all the tables and call the code below on each of them. I tried to set up SQL code to run against each table like this:

DO $$DECLARE r record;
BEGIN
  somehow_captured = SELECT last_value from #{table}_id_seq
  IF (somehow_captured == 1) THEN
    == restore initial unique identifier column value here ==
  END

  IF (somehow_captured > 1) THEN
    TRUNCATE TABLE #{table};
  END IF;
END$$;

I manipulated this code in various ways but couldn't make it work, because I am unfamiliar with PostgreSQL functions and blocks (and variables).

My guess was also that EXISTS(SELECT something FROM TABLE) could somehow serve as one of the "check" units the cleaning procedure should consist of, but I haven't managed to make that work either.

I would appreciate any hints on how this procedure could be accomplished in a PostgreSQL-native way.

UPDATE:

I need all this to run unit and integration tests for Ruby or Ruby on Rails projects. Each test should have a clean DB before it runs, or should clean up after itself (so-called teardown). Transactions are very good, but they become unusable when running tests against particular webdrivers; in my case a switch to the truncation strategy is needed. Now that I have updated the question with a reference to RoR, please do not post answers like "Obviously, you need DatabaseCleaner for PG" and so on.

UPDATE 2:

The strategy described here was recently merged into DatabaseCleaner (https://github.com/bmabey/database_cleaner) as the :pre_count option (see the README there).

Solution

If anyone is interested in the current strategy I use for this, see this Ruby-based repo, which covers both MySQL and PostgreSQL: https://github.com/stanislaw/truncate-vs-count

My results:

MySQL: the fastest strategy for cleaning databases is truncation with the following modification:

if table is not empty
  truncate
else
  if AUTO_INCREMENT is not 0
    truncate
  end
end

  • For MySQL, plain truncation is much faster than plain deletion. The only case where DELETE wins over TRUNCATE is on an empty table.
  • For MySQL, truncation with empty checks is much faster than plain truncation of every table.
  • For MySQL, deletion with empty checks is much faster than plain DELETE on every table.
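
The MySQL check above could be sketched in Ruby roughly as follows. This is only an illustration, not the code from the linked repo: `conn` is a hypothetical object exposing `select_value(sql)` and `execute(sql)`, the names `mysql_table_used?`, `clean_mysql` and `skip:` are made up for this example, and table names are assumed to be trusted identifiers. It also assumes that a never-used table reports an AUTO_INCREMENT of 1 in information_schema, so it compares against 1 rather than 0.

```ruby
# Sketch only: `conn` is a hypothetical connection object with
#   select_value(sql) -> first column of the first row (or nil)
#   execute(sql)      -> runs a statement
# Table names are assumed to be trusted (never user input).

def mysql_table_used?(conn, table)
  # EXISTS stops at the first row found, so the check stays cheap.
  has_rows = conn.select_value("SELECT EXISTS (SELECT 1 FROM #{table} LIMIT 1)").to_i == 1
  return true if has_rows

  # Empty table: it still needs cleaning if its counter has advanced
  # (rows were inserted and deleted). Assumption: a fresh table reports 1.
  auto_increment = conn.select_value(<<~SQL).to_i
    SELECT AUTO_INCREMENT
    FROM information_schema.TABLES
    WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = '#{table}'
  SQL
  auto_increment > 1
end

# Truncates only the tables that were actually used; `skip` lists
# the tables to exclude from the procedure.
def clean_mysql(conn, tables, skip: [])
  (tables - skip).each do |table|
    conn.execute("TRUNCATE TABLE #{table}") if mysql_table_used?(conn, table)
  end
end
```

The `skip:` list covers the requirement from the question to exclude N tables from the procedure.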

PostgreSQL: the fastest strategy for cleaning databases is deletion with the same empty checks as for MySQL, but relying on currval instead of AUTO_INCREMENT:

if table is not empty
  delete table
else 
  if currval is not 0
    delete table
  end
end

  • For PostgreSQL, plain deletion is much faster than plain TRUNCATE (even multiple TRUNCATE).
  • For PostgreSQL, multiple TRUNCATE with empty checks beforehand is slightly faster than plain multiple TRUNCATE.
  • For PostgreSQL, deletion with empty checks is slightly faster than plain deletion.
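
A rough Ruby sketch of the PostgreSQL variant is shown below, under several assumptions that are not from the benchmark repo. `conn` is a hypothetical object exposing `select_value(sql)` and `execute(sql)`, the names `pg_table_used?`, `clean_pg` and `skip:` are invented for the example, and every table is assumed to own a sequence named `<table>_id_seq`. Instead of currval, which raises an error in a session that has not yet called nextval on that sequence, the sketch reads the sequence's `is_called` flag. Since DELETE leaves the sequence untouched, it also resets the sequence with `setval`; that step is my addition for the "reset attached sequences" part of the question.

```ruby
# Sketch only: `conn` is a hypothetical connection object with
#   select_value(sql) -> first column of the first row (or nil)
#   execute(sql)      -> runs a statement
# Assumes each table owns a sequence named "<table>_id_seq".

# Drivers differ in how they return booleans ("t", true, 1, ...).
def pg_true?(value)
  %w[t true 1].include?(value.to_s)
end

def pg_table_used?(conn, table)
  return true if pg_true?(conn.select_value("SELECT EXISTS (SELECT 1 FROM #{table})"))

  # Empty table: is_called becomes true once nextval has run,
  # which means rows were inserted at some point and deleted again.
  pg_true?(conn.select_value("SELECT is_called FROM #{table}_id_seq"))
end

# Deletes from the used tables only and resets their sequences so that
# the next generated id is 1 again; `skip` lists the excluded tables.
def clean_pg(conn, tables, skip: [])
  (tables - skip).each do |table|
    next unless pg_table_used?(conn, table)

    conn.execute("DELETE FROM #{table}")
    conn.execute("SELECT setval('#{table}_id_seq', 1, false)")
  end
end
```

A call such as `clean_pg(conn, tables, skip: %w[schema_migrations])` would leave the excluded tables untouched.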

This is where it began: https://github.com/bmabey/database_cleaner/issues/126

This is the resulting code and a long discussion: https://github.com/bmabey/database_cleaner/pull/127

This is the discussion on the pgsql-performance mailing list: http://archives.postgresql.org/pgsql-performance/2012-07/msg00047.php

We began collecting user feedback confirming that my idea of checking for empty tables first is right.
