Specifying many rows in a table


Problem description


I have a large table (potentially tens or hundreds of millions of rows) and
I need to extract some number of these rows, defined by an integer primary
key.

So, the obvious answer is

select * from table where id in (1,3,4);

But I may want to extract a large number of rows, many thousands

select * from table where id in (1, 3, 5, ...., 100000, 100017, 23000);

This falls over when it exceeds the maximum expression depth of 10,000.
And I have a sneaky feeling that increasing max_expr_depth isn't the
right fix.

Performance is pretty important, so does anyone have a good suggestion
for how to phrase this query so that it'll at worst only be a single
seq-scan through the mondo table, and make a reasonable choice as to
whether to use an index-scan or seq-scan, depending on the number
of rows I'm pulling out?

Cheers,
Steve

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Recommended answer

On Wed, 28 Jan 2004, Steve Atkins wrote:
> I have a large table (potentially tens or hundreds of millions of rows) and
> I need to extract some number of these rows, defined by an integer primary
> key.
>
> So, the obvious answer is
>
> select * from table where id in (1,3,4);
>
> But I may want to extract a large number of rows, many thousands
>
> select * from table where id in (1, 3, 5, ...., 100000, 100017, 23000);

If the ranges are sequential, then between would work. I have a feeling
that they aren't though, looking at your examples.

> This falls over when it exceeds the maximum expression depth of 10,000.
> And I have a sneaky feeling that increasing max_expr_depth isn't the
> right fix.





The optimizations made for in() queries in the 7.4 branch only really work
when there's a subselect / table in the in. You could try inserting those
numbers into a temp table and subselecting it.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly
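The temp-table suggestion above can be sketched as follows. This is a minimal illustration using Python's sqlite3 with an in-memory database standing in for PostgreSQL (the SQL pattern is the same); the table and column names (`big_table`, `keys`, `id`, `val`) are made up for the example, not taken from the thread.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; all names here are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO big_table VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 1001)])

# Instead of one enormous literal IN (...) list, load the wanted keys
# into a temporary table...
wanted = [1, 3, 5, 100, 997]
conn.execute("CREATE TEMP TABLE keys (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO keys VALUES (?)", [(k,) for k in wanted])

# ...and query with IN (subselect) -- the form the 7.4 optimizations target.
rows = conn.execute(
    "SELECT t.id, t.val FROM big_table t "
    "WHERE t.id IN (SELECT id FROM keys) ORDER BY t.id"
).fetchall()
print(rows)
```

The key list never appears in the SQL text, so the expression-depth limit is never hit, and the planner can choose an index scan or seq scan based on the size of the temp table.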


On Wed, Jan 28, 2004 at 01:15:27PM -0700, scott.marlowe wrote:
> On Wed, 28 Jan 2004, Steve Atkins wrote:
> > select * from table where id in (1, 3, 5, ...., 100000, 100017, 23000);
> >
> > This falls over when it exceeds the maximum expression depth of 10,000.
> > And I have a sneaky feeling that increasing max_expr_depth isn't the
> > right fix.
>
> The optimizations made for in() queries in the 7.4 branch only really work
> when there's a subselect / table in the in. You could try inserting those
> numbers into a temp table and subselecting it.







I think using IN with such a big distinct set would suck performance-
wise, because it would have to resort to the old method which was slow.
I could be mistaken though. If IN (SELECT...) doesn't work, I'd try
rewriting the query to use an EXISTS pulling the values from a temp
table instead ...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org
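The EXISTS rewrite suggested above, sketched the same way — an in-memory SQLite database stands in for PostgreSQL, and all table and column names are illustrative:

```python
import sqlite3

# SQLite stand-in; "big_table" and "keys" are made-up names for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO big_table VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 101)])
conn.execute("CREATE TEMP TABLE keys (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO keys VALUES (?)", [(k,) for k in (2, 42, 99)])

# Correlated EXISTS against the temp table, in place of a huge IN list.
rows = conn.execute(
    "SELECT t.id, t.val FROM big_table t "
    "WHERE EXISTS (SELECT 1 FROM keys k WHERE k.id = t.id) "
    "ORDER BY t.id"
).fetchall()
print(rows)
```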


If you could insert the "in" values into a temporary table then join
against your *large* table you may find that you get better performance, at
the expense of having to populate the temp table with your "in" values.

Rather than having to populate and delete your "in" values ready for the
next query, consider adding a search counter column to the table, eg

(in pseudo code)
create temporary table INTABLE (searchid integer, invalue integer);

create a function InsertInValues(searchid integer, values text) which
takes a comma delimited list of "in" values, and uses a loop to insert
them into your (temp) INTABLE.

now your query is

select * from table t, intable it
where t.keyvalue = it.invalue
and searchid = XXXX;

Next query, call InsertInValues(searchid integer, values text) again with
the next set of data, incrementing the searchid value for the function
call, and then perform the next search.

Or write a function SearchUsingInValues(invalue integer) which does the
two steps above within the one function (generating its own internal
searchid), and returns a setof values.

Or do away with the searchid stuff, and either drop/recreate the temp
table between calls, or truncate the data.

Someone else may be able to suggest a better/faster way of generating the
"in" values...

I don't know if that's any use. Hope so.

John Sidney-Woollett

Steve Atkins said:
> I have a large table (potentially tens or hundreds of millions of rows) and
> I need to extract some number of these rows, defined by an integer primary
> key.
>
> So, the obvious answer is
>
> select * from table where id in (1,3,4);
>
> But I may want to extract a large number of rows, many thousands
>
> select * from table where id in (1, 3, 5, ...., 100000, 100017, 23000);
>
> This falls over when it exceeds the maximum expression depth of 10,000.
> And I have a sneaky feeling that increasing max_expr_depth isn't the
> right fix.
>
> Performance is pretty important, so does anyone have a good suggestion
> for how to phrase this query so that it'll at worst only be a single
> seq-scan through the mondo table, and make a reasonable choice as to
> whether to use an index-scan or seq-scan, depending on the number
> of rows I'm pulling out?
>
> Cheers,
> Steve




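The searchid scheme described above can be sketched like this, again with in-memory SQLite standing in for PostgreSQL. `InsertInValues` is approximated by a Python helper rather than a server-side function, and every name is illustrative:

```python
import sqlite3
from itertools import count

# SQLite stand-in for PostgreSQL; "big_table" and "intable" are made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO big_table VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 51)])

# One shared temp table holds the key sets for many searches, tagged by a
# searchid counter, so it need not be dropped and refilled between queries.
conn.execute("CREATE TEMP TABLE intable (searchid INTEGER, invalue INTEGER)")
next_searchid = count(1)

def insert_in_values(values):
    """Rough stand-in for InsertInValues(): tag a batch of keys with a fresh searchid."""
    sid = next(next_searchid)
    conn.executemany("INSERT INTO intable VALUES (?, ?)",
                     [(sid, v) for v in values])
    return sid

def search(sid):
    # Join the big table against only this search's keys.
    return conn.execute(
        "SELECT t.id, t.val FROM big_table t JOIN intable it "
        "ON t.id = it.invalue WHERE it.searchid = ? ORDER BY t.id",
        (sid,)).fetchall()

sid1 = insert_in_values([3, 7, 11])
sid2 = insert_in_values([7, 40])  # a second search coexists with the first
print(search(sid1))
print(search(sid2))
```

Dropping the searchid column and truncating `intable` between calls, as the answer also suggests, is the simpler variant when searches never overlap.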

