Huge Data


Problem description




Hi,

I use PostgreSQL 7.4 to store a huge amount of data, for example 7
million rows. But when I run the query "select count(*) from table;", it
returns after about 120 seconds. Is this normal for such a huge
table? Is there any way to speed up the query? The huge
table has an integer primary key and some other indexes on other columns.

The hardware is: PIII 800 MHz processor, 512 MB RAM, and IDE hard disk
drive.

-sezai

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Recommended answer

On Wednesday 14 January 2004 11:11, Sezai YILMAZ wrote:


Hi,

I use PostgreSQL 7.4 to store a huge amount of data, for example 7
million rows. But when I run the query "select count(*) from table;", it
returns after about 120 seconds. Is this normal for such a huge
table? Is there any way to speed up the query? The huge
table has an integer primary key and some other indexes on other columns.







PG uses MVCC to manage concurrency. A downside of this is that to verify the
exact number of rows in a table you have to visit them all.

There's plenty on this in the archives, and probably the FAQ too.

What are you using the count() for?

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
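When an estimate is enough, the planner's own statistics offer a cheaper alternative to visiting every row. The sketch below reads the row estimate that PostgreSQL keeps in the pg_class catalog. The table name log is taken from later in this thread, and the figure is only as fresh as the last VACUUM or ANALYZE, so treat it as approximate rather than exact.

```sql
-- Approximate row count from planner statistics (no table scan).
-- reltuples is an estimate refreshed by VACUUM / ANALYZE, not a live count.
SELECT relname, reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'log';   -- 'log' is the table name used later in the thread
```

If several schemas contain a table with the same name, this returns one row per table, so a schema filter may be needed.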


Richard Huxton wrote:

PG uses MVCC to manage concurrency. A downside of this is that to verify the
exact number of rows in a table you have to visit them all.

There's plenty on this in the archives, and probably the FAQ too.

What are you using the count() for?




I use count() for some statistics, just to show how many records
have been collected so far.

-sezai

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
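For a "records collected so far" figure that must be exact, one common workaround is to keep the count in a small side table maintained by a trigger, so that reading it never scans the big table. The following is a minimal sketch, not something proposed in the thread: the names log_rowcount, log_rowcount_adjust and log_rowcount_trg are hypothetical, and the $$ quoting assumes PostgreSQL 8.0 or later (on 7.4 the function body would have to be written as an ordinary quoted string).

```sql
-- Hypothetical one-row counter table, seeded once with the real count.
CREATE TABLE log_rowcount (n bigint NOT NULL);
INSERT INTO log_rowcount SELECT count(*) FROM log;

-- Adjust the counter on every inserted or deleted row.
CREATE FUNCTION log_rowcount_adjust() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE log_rowcount SET n = n + 1;
        RETURN NEW;
    ELSE  -- DELETE
        UPDATE log_rowcount SET n = n - 1;
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_rowcount_trg
AFTER INSERT OR DELETE ON log
FOR EACH ROW EXECUTE PROCEDURE log_rowcount_adjust();

-- Reading the count is now a single-row lookup.
SELECT n FROM log_rowcount;
```

The trade-off is that every insert and delete updates the same counter row, so concurrent writers queue up behind one another. For a log table with a single loader this may be acceptable; with many writers the estimate from pg_class may be the better choice.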


Richard Huxton wrote:
PG uses MVCC to manage concurrency. A downside of this is that to verify the
exact number of rows in a table you have to visit them all.

There's plenty on this in the archives, and probably the FAQ too.

What are you using the count() for?







select logid, agentid, logbody from log where logid=3000000;

This query also returns after about 120 seconds. The table log has about
7 million records, and logid is the primary key of the log table. What about
that? Why is it so slow?

-sezai
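
A lookup on a primary key that takes as long as a full count suggests the index is not being used. As the list's own tip says, EXPLAIN ANALYZE is the first thing to try. The sketch below shows one way to check. The thread does not state the exact type of logid, so the bigint cast in the last statement is an assumption: releases before 8.0 generally would not use an index on a bigint column when compared with a plain integer literal.

```sql
-- Show the plan actually executed: look for "Index Scan" vs "Seq Scan".
EXPLAIN ANALYZE
SELECT logid, agentid, logbody FROM log WHERE logid = 3000000;

-- Refresh planner statistics in case they are stale or missing.
ANALYZE log;

-- Assumption: logid is bigint. On pre-8.0 releases, casting the literal
-- to the column's type lets the planner consider the primary-key index.
EXPLAIN ANALYZE
SELECT logid, agentid, logbody FROM log WHERE logid = 3000000::bigint;
```

If the first plan already shows an index scan and is still slow, the cause lies elsewhere (for example disk I/O on this hardware) and the cast will not help.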

