PostgreSQL insert speed tests


Problem Description


Hello

I need high throughput while inserting into PostgreSQL. Because of that I
did some PostgreSQL insert performance tests.

------------------------------------------------------------
-- Test schema
create table logs (
logid serial primary key,
ctime integer not null,
stime integer not null,
itime integer not null,
agentid integer not null,
subagentid integer not null,
ownerid integer not null,
hostid integer not null,
appname varchar(64) default null,
logbody varchar(1024) not null
);

create index ctime_ndx on logs using btree (ctime);
create index stime_ndx on logs using btree (stime);
create index itime_ndx on logs using btree (itime);
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);
------------------------------------------------------------

Test Hardware:
IBM Thinkpad R40
CPU: Pentium 4 Mobile 1993 Mhz (full powered)
RAM: 512 MB
OS: GNU/Linux, Fedora Core 1, kernel 2.4.24

A test program developed with libpq inserts 200.000 rows into table
logs. Insertions are made with 100 rows per transaction (total 2.000
transactions).
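Each of those transactions has the following shape (a sketch; the libpq test program itself is not shown in the post, and the column values here are made up):

----------------------------------------------------------------
begin;
insert into logs (ctime, stime, itime, agentid, subagentid,
                  ownerid, hostid, appname, logbody)
    values (1073001600, 1073001601, 1073001602, 1, 1, 1, 1,
            'testapp', 'sample log body');
-- ... 99 more inserts like the one above ...
-- (logid is omitted; the serial default fills it)
commit;
----------------------------------------------------------------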

Some parameter changes in the postgresql.conf file follow:
----------------------------------------------------------------
shared_buffers = 2048 # min max_connections*2 or 16, 8KB each
max_fsm_relations = 20000 # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 200000 # min 1000, fsm is free space map, ~6 bytes
max_locks_per_transaction = 256 # min 10
wal_buffers = 64 # min 4, typically 8KB each
sort_mem = 32768 # min 64, size in KB
vacuum_mem = 16384 # min 1024, size in KB
checkpoint_segments = 6 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 900 # range 30-3600, in seconds
fsync = true
wal_sync_method = fsync # the default varies across platforms:
enable_seqscan = true
enable_indexscan = true
enable_tidscan = true
enable_sort = true
enable_nestloop = true
enable_mergejoin = true
enable_hashjoin = true
effective_cache_size = 2000 # typically 8KB each
geqo = true
geqo_selection_bias = 2.0 # range 1.5-2.0
geqo_threshold = 11
geqo_pool_size = 0 # default based on tables in statement,
# range 128-1024
geqo_effort = 1
geqo_generations = 0
geqo_random_seed = -1 # auto-compute seed
----------------------------------------------------------------

The test was made with both PostgreSQL 7.3.4 and PostgreSQL 7.4.1 (the
test program was recompiled when switching versions).

The results are below (average inserted rows per second).

# of EXISTING RECORDS      PostgreSQL 7.3.4    PostgreSQL 7.4.1
=================================================================
0 initial records          1086 rows/s         1324 rows/s
200.000 initial records     781 rows/s          893 rows/s
400.000 initial records     576 rows/s          213 rows/s
600.000 initial records     419 rows/s          200 rows/s
800.000 initial records     408 rows/s          not tested because of bad results
When the logs table is rebuilt with only one index (the primary key),
a speed of 2941 rows/s is reached. But I need all seven indexes.

The question is: why is PostgreSQL 7.4.1 so slow under heavy work?

Is there a way to speed up inserts without eliminating indexes?

What about concurrent inserts (concurrent spare test program executions)
into the same table? It did not work.

-sezai


Recommended Answer

Sezai YILMAZ wrote:
> Test Hardware:
> IBM Thinkpad R40
> CPU: Pentium 4 Mobile 1993 Mhz (full powered)
> RAM: 512 MB
> OS: GNU/Linux, Fedora Core 1, kernel 2.4.24
>
> A test program developed with libpq inserts 200.000 rows into table
> logs. Insertions are made with 100 rows per transaction (total 2.000
> transactions).
>
> Some parameter changes in the postgresql.conf file follow:
> ----------------------------------------------------------------
> shared_buffers = 2048 # min max_connections*2 or 16, 8KB each

I suggest you up that to, say, 10000 buffers.

> max_fsm_relations = 20000 # min 10, fsm is free space map, ~40 bytes
> max_fsm_pages = 200000 # min 1000, fsm is free space map, ~6 bytes
> max_locks_per_transaction = 256 # min 10
> wal_buffers = 64 # min 4, typically 8KB each
> sort_mem = 32768 # min 64, size in KB

You need to pull that down a little, I guess. How about 8/16MB?

> vacuum_mem = 16384 # min 1024, size in KB

Not required. 1024 would do, since you are testing inserts anyway. Of
course, it only matters when you run vacuum.

> effective_cache_size = 2000 # typically 8KB each

Is that true? It tells PostgreSQL that it has around 16MB of memory. Set
it to around 15000 so that 100MB+ is assumed; that might change the
results of index scans. I always prefer to set it to whatever is available.

> The test was made with both PostgreSQL 7.3.4 and PostgreSQL 7.4.1 (the
> test program was recompiled when switching versions).
>
> The results are below (average inserted rows per second).
>
> # of EXISTING RECORDS      PostgreSQL 7.3.4    PostgreSQL 7.4.1
> =================================================================
> 0 initial records          1086 rows/s         1324 rows/s
> 200.000 initial records     781 rows/s          893 rows/s
> 400.000 initial records     576 rows/s          213 rows/s
> 600.000 initial records     419 rows/s          200 rows/s
> 800.000 initial records     408 rows/s          not tested because of bad results

Do you mean 800000? I believe the '.' is a thousands separator here, but
I'm not too sure.. :-)

> When the logs table is rebuilt with only one index (the primary key),
> a speed of 2941 rows/s is reached. But I need all seven indexes.
>
> The question is: why is PostgreSQL 7.4.1 so slow under heavy work?

Can you run vmstat and see where things get stalled? Perhaps you can up
the number of WAL segments and try again.

> Is there a way to speed up inserts without eliminating indexes?
>
> What about concurrent inserts (concurrent spare test program executions)
> into the same table? It did not work.





What does it mean, it did not work? Any errors?

HTH

Shridhar
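
Collected into a single postgresql.conf fragment, the changes suggested
above might look like this (a sketch; the checkpoint_segments value is
illustrative, the others follow the numbers given in the reply):

----------------------------------------------------------------
shared_buffers = 10000          # ~80 MB at 8 KB per buffer (was 2048)
sort_mem = 16384                # 16 MB per sort, down from 32 MB
vacuum_mem = 1024               # the minimum; only matters while vacuum runs
effective_cache_size = 15000    # ~120 MB at 8 KB per page (was 2000)
checkpoint_segments = 12        # more WAL segments between checkpoints (was 6)
----------------------------------------------------------------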



I don't know the answer to the question of why 7.4 is slower, but I have
some suggestions on additional things to test, and how to make it faster.

First off, try 200 transactions of 1000 records each; you might even want
to try 20 transactions of 10,000 records each. Postgres seems to run much
faster the fewer commits you have, but different configs may change the
sweet spot.
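
In terms of the SQL stream, that just means moving the begin/commit
boundaries further apart (a sketch):

----------------------------------------------------------------
begin;
-- 1000 (or 10,000) inserts into logs instead of 100 ...
commit;
-- repeated 200 (or 20) times
----------------------------------------------------------------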

Secondly, one possible solution to your problem is to drop the indexes,
insert the new rows and recreate the indexes. Of course, for testing,
you'll want to time the entire process of drop/insert/create and compare
it to the raw insert time with indexes intact. I use stored procedures
on my databases, i.e.:

select drop_foo_indexes();
....
<commands to insert many rows into table foo>
....
select create_foo_indexes();
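
The bodies of those functions are not shown; a plain-SQL equivalent of
the same pattern for the logs table above would be (a sketch, inlining
the index DDL from the schema; the primary-key index stays in place):

----------------------------------------------------------------
drop index ctime_ndx;
drop index stime_ndx;
drop index itime_ndx;
drop index agentid_ndx;
drop index ownerid_ndx;
drop index hostid_ndx;

-- ... bulk inserts into logs ...

create index ctime_ndx on logs using btree (ctime);
create index stime_ndx on logs using btree (stime);
create index itime_ndx on logs using btree (itime);
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);
----------------------------------------------------------------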

Another thing to consider is vacuums. You don't mention how often you
vacuumed the database during testing; I would recommend a "vacuum full"
between each test (unless, of course, you're testing how much a lack
of vacuum hurts performance ;)
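
Between test runs that would be, for the table in question:

----------------------------------------------------------------
vacuum full analyze logs;
----------------------------------------------------------------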

Hope this helps.






--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Sezai YILMAZ wrote:
> create index agentid_ndx on logs using hash (agentid);
> create index ownerid_ndx on logs using hash (ownerid);
> create index hostid_ndx on logs using hash (hostid);
> ------------------------------------------------------------
>
> # of EXISTING RECORDS      PostgreSQL 7.3.4    PostgreSQL 7.4.1
> =================================================================
> 0 initial records          1086 rows/s         1324 rows/s
> 200.000 initial records     781 rows/s          893 rows/s
> 400.000 initial records     576 rows/s          213 rows/s
> 600.000 initial records     419 rows/s          200 rows/s
> 800.000 initial records     408 rows/s          not tested because of bad results







I changed the three hash indexes to btree.
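
In SQL terms the rebuild looks like this (a sketch using the index names
from the schema above; the test program itself is unchanged):

----------------------------------------------------------------
drop index agentid_ndx;
drop index ownerid_ndx;
drop index hostid_ndx;

create index agentid_ndx on logs using btree (agentid);
create index ownerid_ndx on logs using btree (ownerid);
create index hostid_ndx on logs using btree (hostid);
----------------------------------------------------------------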

Performance increased about 2x (to 1905 rows/s on PostgreSQL 7.3.4).

Concurrent inserts now work.

The changed indexes would seem better suited to hash: there is no
ordering on those columns, only exact-value matching, which is more
natural for hash-type indexes. But hash indexes have possible deadlock
problems with multiple concurrent inserts. I think I can live with btree
indexes. They work better. :-)

-sezai



