查询性能慢 [英] slow query performance

查看:62
本文介绍了查询性能慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在使用概念上简单的
数据库时遇到了严重的性能问题。该数据库有一个表,包含天气观测。

该表目前有大约1350万行,并且正在不断更新

。 (该数据库运行在512MHz RAM的双550MHz PIII上。

我在i686-pc-linux-gnu上有PostgreSQL 7.1.3,由GCC编译2.96

on RedHat 7.2)


总的来说,查询形式如下:


SELECT? FROM obs WHERE station =?

AND valid_time< ? AND valid_time> ?

或:

SELECT? FROM obs WHERE station IN(?,?,...)

AND valid_time< ? AND valid_time> ?


像这样的查询每次花费大约4到5分钟,这对我来说似乎过于缓慢(或者我的期望过于乐观了吗?)。


例如:

SELECT站,air_temp FROM obs

WHERE station =''EGBB''

AND valid_time> ''28 / 8/03 00:00''

AND valid_time< ''28 / 10/03 00:00''


需要4分钟32秒。


以上查询的解释说:

注意:查询计划:


在obs上使用obs_pkey进行索引扫描(成本= 0.00..9.01行= 1宽度= 20)


一个简单的来自obs的SELECT count(*)查询需要花费大约一段时间




我已经运行了vacuumdb --analyze obs,收效甚微。


如何加快速度?我哪里错了?

表结构或索引是否有问题?是否持续更新

数据库(以每秒1-3个条目之间的速率)导致

问题?


表和索引定义如下:

表obs

属性|输入|修饰符

---------------------------- + ------------ -------------- + ----------

valid_time |带时区的时间戳|

metar_air_temp |双精度|

relative_humidity |双精度|

pressure_change |双精度|

上限|双精度|

metar_dew_point |双精度|

metar_gusts |双精度|

wet_bulb_temperature |双精度|

past_weather |文字|

公开性|双精度|

metar_visibility |双精度|

降水|双精度|

站|字符(10)|

pressure_msl |双精度|

metar_min_temperature_6hr |双精度|

precipitation_period |双精度|

metar_wet_bulb |双精度|

saturation_mixing_ratio |双精度|

metar_pressure |双精度|

metar_sky_cover |文字|

露点|双精度|

wind_direction |双精度|

actual_time |带时区的时间戳|

gust_speed |双精度|

high_cloud_type |文字|

precipitation_24hr |双精度|

metar_precipitation_24hr |双精度|

pressure_tendency |文字|

metar_relative_humidity |双精度|

low_cloud_type |文字|

metar_max_temperature_6hr |双精度|

middle_cloud_type |文字|

air_temp |双精度|

low_and_middle_cloud_cover |文字|

metar_wind_dir |双精度|

metar_weather |文字|

snow_depth |双精度|

metar_snow_depth |双精度|

min_temp_12hr |双精度|

present_weather |文字|

wind_speed |双精度|

snow_cover |文字|

metar_wind_speed |双精度|

metar_ceiling |双精度|

max_temp_12hr |双精度|

mixing_ratio |双精度|

pressure_change_3hr |双精度|

total_cloud |整数|

max_temp_24hr |双精度|

min_temp_24hr |双精度|

snow_amount_6hr |双精度|

指数:obs_pkey,

obs_station,

obs_valid_time


指数" obs_pkey" ;

属性|输入

------------ + --------------------------

valid_time |带时区的时间戳

station |字符(10)

唯一btree


索引" obs_station"

属性|输入

----------- + ---------------

station |字符(10)

btree


索引" obs_valid_time"

属性|输入

------------ + --------------------------

valid_time |带时区的时间戳

btree


(我怀疑obs_valid_time索引是多余的,因为

obs_pkey索引 - 是对吗?)


我会很感激任何建议和任何线索来帮助加快这一点。

非常感谢,

Dave


---------------------------(播出结束)--- ------------------------

提示3:如果通过Usenet发布/阅读,请发送相应的

subscribe-nomail命令 ma*******@postgresql.org 以便

消息可以干净利落地到达邮件列表

I''m having severe performance issues with a conceptually simple
database. The database has one table, containing weather observations.
The table currently has about 13.5 million rows, and is being updated
constantly. (The database is running on a dual 550MHz PIII with 512MB RAM.
I have PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC 2.96
on RedHat 7.2)

On the whole, queries are of the form:

SELECT ? FROM obs WHERE station = ?
AND valid_time < ? AND valid_time > ?
or:
SELECT ? FROM obs WHERE station IN (?, ?, ...)
AND valid_time < ? AND valid_time > ?

Queries like these are taking around 4 to 5 minutes each, which seems
excessively slow to me (or are my expectations far too optimistic?).

For instance:
SELECT station, air_temp FROM obs
WHERE station = ''EGBB''
AND valid_time > ''28/8/03 00:00''
AND valid_time < ''28/10/03 00:00''

takes 4 mins 32 secs.

An EXPLAIN of the above query says:
NOTICE: QUERY PLAN:

Index Scan using obs_pkey on obs (cost=0.00..9.01 rows=1 width=20)

A simple "SELECT count(*) from obs" query takes around that sort of time
too.

I have run "vacuumdb --analyze obs", to little effect.

How can I speed this up? Where am I going wrong? Is there a problem with
the table structure, or the indexes? Does the continual updating of the
database (at the rate of somewhere between 1-3 entries per second) cause
problems?

The table and indexes are defined as follows:
Table "obs"
Attribute | Type | Modifier
----------------------------+--------------------------+----------
valid_time | timestamp with time zone |
metar_air_temp | double precision |
relative_humidity | double precision |
pressure_change | double precision |
ceiling | double precision |
metar_dew_point | double precision |
metar_gusts | double precision |
wet_bulb_temperature | double precision |
past_weather | text |
visibility | double precision |
metar_visibility | double precision |
precipitation | double precision |
station | character(10) |
pressure_msl | double precision |
metar_min_temperature_6hr | double precision |
precipitation_period | double precision |
metar_wet_bulb | double precision |
saturation_mixing_ratio | double precision |
metar_pressure | double precision |
metar_sky_cover | text |
dew_point | double precision |
wind_direction | double precision |
actual_time | timestamp with time zone |
gust_speed | double precision |
high_cloud_type | text |
precipitation_24hr | double precision |
metar_precipitation_24hr | double precision |
pressure_tendency | text |
metar_relative_humidity | double precision |
low_cloud_type | text |
metar_max_temperature_6hr | double precision |
middle_cloud_type | text |
air_temp | double precision |
low_and_middle_cloud_cover | text |
metar_wind_dir | double precision |
metar_weather | text |
snow_depth | double precision |
metar_snow_depth | double precision |
min_temp_12hr | double precision |
present_weather | text |
wind_speed | double precision |
snow_cover | text |
metar_wind_speed | double precision |
metar_ceiling | double precision |
max_temp_12hr | double precision |
mixing_ratio | double precision |
pressure_change_3hr | double precision |
total_cloud | integer |
max_temp_24hr | double precision |
min_temp_24hr | double precision |
snow_amount_6hr | double precision |
Indices: obs_pkey,
obs_station,
obs_valid_time

Index "obs_pkey"
Attribute | Type
------------+--------------------------
valid_time | timestamp with time zone
station | character(10)
unique btree

Index "obs_station"
Attribute | Type
-----------+---------------
station | character(10)
btree

Index "obs_valid_time"
Attribute | Type
------------+--------------------------
valid_time | timestamp with time zone
btree

(I suspect the obs_valid_time index is redundant, because of the
obs_pkey index - is that right?)

I''d be grateful for any advice and any clues to help speed this up.
Many thanks,
Dave

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

推荐答案

Dave Weaver写道:

过于乐观?)。
Dave Weaver wrote:
too optimistic?).

例如:
SELECT station,air_temp FROM obs
WHERE station =''EGBB''
AND valid_time> ''28 / 8/03 00:00''
AND valid_time< ''28 / 10/03 00:00''

需要4分32秒。

以上查询的解释说:
通知:查询计划:

在obs上使用obs_pkey进行索引扫描(成本= 0.00..9.01行= 1宽度= 20)

一个简单的来自obs的SELECT count(*)查询需要花费那么多时间。


这不是测试。 Postgresql将始终获取整个表。
我已经运行vacuumdb --analyze obs,效果不大。

For instance:
SELECT station, air_temp FROM obs
WHERE station = ''EGBB''
AND valid_time > ''28/8/03 00:00''
AND valid_time < ''28/10/03 00:00''

takes 4 mins 32 secs.

An EXPLAIN of the above query says:
NOTICE: QUERY PLAN:

Index Scan using obs_pkey on obs (cost=0.00..9.01 rows=1 width=20)

A simple "SELECT count(*) from obs" query takes around that sort of time
too.
That is no test. Postgresql will always fetch the entire table.
I have run "vacuumdb --analyze obs", to little effect.




检查 http://www.varlena.com/varlena/Gener...bits/perf .html for general

调整提示,如果表经常更新,请填充真空。真空后重新索引

指数。这需要相当长的时间。


看看是否有帮助。


Shridhar

---- -----------------------(播出结束)---------------------- -----

提示8:解释分析是你的朋友



Check http://www.varlena.com/varlena/Gener...bits/perf.html for general
tuning tips and do a vacuum full if table is updated frequently. Also reindex
the indexes after vacuum. It will take quite some time though.

See if that helps.

Shridhar
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


例如:
SELECT station,air_temp FROM obs
WHERE station =''EGBB''
AND valid_time> ''28 / 8/03 00:00''
AND valid_time< ''28 / 10/03 00:00''

需要4分钟32秒。
For instance:
SELECT station, air_temp FROM obs
WHERE station = ''EGBB''
AND valid_time > ''28/8/03 00:00''
AND valid_time < ''28/10/03 00:00''

takes 4 mins 32 secs.




应该多少行返回?

[解释分析会告诉你]


虽然运行是你的磁盘颠簸? vmstat的bi / bo列将

告诉你。


7.1已经很老了,但我确实理解了迁移到7.3 [或

..4beta]带有巨大的数据库


-

Jeff Trout< je ** @ jefftrout.com> ;
http://www.jefftrout.com/
http://www.stuarthamm.net/


---------------------------(广播结束)--------------- ------------

提示9:如果您的

加入列的数据类型,规划人员将忽略您选择索引扫描的愿望不匹配



How many rows should that return?
[explain analyze will tell you that]

and while that runs is your disk thrashing? vmstat''s bi/bo columns will
tell you.

7.1 is quite old, but I do understand the pain of migraing to 7.3 [or
..4beta] with huge db''s

--
Jeff Trout <je**@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column''s datatypes do not match


Jeff写道:
Dave Weaver写道:
Dave Weaver wrote:
例如:
SELECT station,air_temp FROM obs
WHERE station =''EGBB''
AND valid_time> ''28 / 8/03 00:00''
AND valid_time< ''28 / 10/03 00:00''

需要4分32秒。
应返回多少行?
[解析分析会告诉你]
For instance:
SELECT station, air_temp FROM obs
WHERE station = ''EGBB''
AND valid_time > ''28/8/03 00:00''
AND valid_time < ''28/10/03 00:00''

takes 4 mins 32 secs.
How many rows should that return?
[explain analyze will tell you that]




" explain analyze"似乎不是这个postgres版本的一部分

(或者我误解了一些东西)。

特定查询返回了24行。


虽然运行是你的磁盘颠簸? vmstat'的bi / bo专栏将告诉你。


机器在建筑物的另一侧,所以我不能实际上

看看磁盘是否正在颠簸。

我不知道如何解释vmstat输出;运行vmstat 1在我开始查询之前,显示

我bi / bo都为零(大部分)。然后bi将
提高到2500左右(bo保持在零左右),直到查询结束。


7.1已经很老了,但我确实理解了用巨大的数据库移动到7.3 [或
.4beta]



"explain analyze" doesn''t seem to be part of this postgres version
(or I misunderstood something).
That particular query returned 24 rows.

and while that runs is your disk thrashing? vmstat''s bi/bo columns will
tell you.
The machine''s over the other side of the building, so I can''t physically
see if the disk is thrashing.
I''m not sure how to interpret the vmstat output; running "vmstat 1" shows
me bi/bo both at zero (mostly) until I start the query. Then bi shoots up
to around 2500 (bo remains around zero) until the query finishes.

7.1 is quite old, but I do understand the pain of migraing to 7.3 [or
.4beta] with huge db''s




升级是否有可能产生影响?

我仍然没有明白我的问题是由于:

1 Postgres版本

2数据库大小

3表结构

4配置问题

5慢速硬件

6以上所有

7没有以上

8别的


感谢您的帮助,

Dave。


---------------------------(广播结束)----------------- ----------

提示6:您是否搜索了我们的列表档案?

http://archives.postgresql.org



Is the upgrade likely to make a difference?
I''m still none-the-wiser wether the problem I have is due to:
1 Postgres version
2 Database size
3 Table structure
4 Configuration issues
5 Slow hardware
6 All of the above
7 None of the above
8 Something else

Thanks for the help,
Dave.

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org


这篇关于查询性能慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆