Linux上的性能问题 [英] Performance problem on Linux

查看:73
本文介绍了Linux上的性能问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述




我有一个相对简单的SQL:


从TABLE中选择FK,其中up​​per(A)喜欢''B% ''和upper(C)喜欢''D%''

我们在Linux和Windows 2003上安装了DB2 UDB v7.1 FP 12


在使用优化级别5以及9和0的Linux上,SQL使用

3''100''000''000 timerons!

在Windows 2003上SQL需要26''000个计时器


我们已经使用SUSE Linux 9.1和Redhat 7.2进行了测试,两者都具有相同的结果。不,零的数量是正确的。


Table Table有140''000行。 SQL的结果集返回200

行。


任何想法可能导致这种差异。访问计划是相同的(表 - > TBSACAN - > RESULT)但是一个是可以接受的,另一个不是。


Runstats已被执行两个表格。


我们看到的另一个问题是,当我们在A列添加索引时,

访问计划变得完全不同。 DB2将在
TABLE上执行索引扫描,然后执行FETCH并将索引扫描的结果与

TABLE(!)连接,然后执行TBSCAN然后返回结果。这个

案例中的FETCH需要3''000''000'000个定时器。


如上所述,我们使用优化级别0测试, 5和9.

比较Windows和Linux上的优化SQL表明,无论是在Windows上还是在Linux上都不需要优化
优化。


有什么想法吗?


问候


Rudolf Bargholz

Hi,

I have a ralatively simple SQL:

select FK from TABLE where upper(A) like ''B%'' and upper(C) like ''D%''

We have DB2 UDB v7.1 FP 12 installed on Linux and on Windows 2003

On Linux using optimization level 5 as well as 9 and 0 the SQL uses
3''100''000''000 timerons !
On Windows 2003 the SQL needs 26''000 timerons

We have tested with SUSE Linux 9.1 as well as Redhat 7.2 , both with the
same dramtic results. And no, the number of zeros is correct.

The Table TABLE has 140''000 rows. The result set of the SQL returns 200
rows.

Any idea what could be causing this difference. The access plans are
identical (TABLE -> TBSACAN -> RESULT) but one is acceptable, the other not.

Runstats have been executed on both tables.

Another problem we have seen is, when we add an index on column A, the
access plan becomes totally different. DB2 will perform an index scan on
TABLE, then performs a FETCH and joins the result of the index scan with
TABLE (!), then does a TBSCAN and then returns the RESULT. The FETCH in this
case takes 3''000''000''000 timerons.

As I mentioned above, we tested using optimization levels 0, 5 and 9.
Comparing the optimized SQL on Windows and Linux showed, that no
optimization was necessary, neither on Windows, nor on Linux.

Any ideas?

Regards

Rudolf Bargholz

推荐答案

Timerons是一个任意的度量单位。我不相信它们可以跨平台比较。


你没有提到查询是否已经在平台上执行了

比较结果执行时间。我怀疑它们是否可以比较,假设类似的硬件并且忽略了由不同操作系统引起的可能的差异




您描述的访问路径是预期的。将

函数(UPPER)应用于列将需要进行表扫描以进行评估。当

你有索引时,首先使用索引,因为它会导致更少的处理I / O
。下面的扫描可能是因为

优化器确定索引处理的结果将会b / b $ b检索到足够的表,扫描会比检索更快

表格的各个页面。


优化这一点的一种方法是:

1.在表格中添加一列,用

的第一个字符A和B各列,大写。这将需要插入和

更新触发器。 (仅在可以修改列内容时更新。)

2.在新的大写列上创建索引。

3.在新创建的索引上对表进行聚类。你将不得不重新格式化表格以使集群变得有效。

4.更改查询以使用新列名并合并两个数据

值为查询中的一对字母。

查询现在将使用新索引来定位

匹配的行范围。当

导致I / O减少时,顺序检测会将访问模式切换为预取。此技术将使用

索引和最小可能的I / O来检索您的数据行。


这不考虑使用导致的其他可能问题

新列作为聚类索引。表行将增加两个

字节/行(假设为NOT NULL),这将添加280,000字节的表。你会

也需要索引的空间。如果经常使用

查询,这可能是一个公平的权衡。解释在聚类之前和之后运行

reorg可用于确定是否需要聚类选项来确保
避免表扫描。


Soundex代码通常用于在

搜索时对名称提供一些选择性。您可以考虑使用它而不是单个字符来获得

选择性。


Phil Sherman



Rudolf Bargholz写道:
Timerons are an arbitrary unit of measure. I don''t believe that they are
comparable across platforms.

You didn''t mention if the queries had been executed on the platforms and
the resultant execution times compared. I suspect that they''d be
comparable, assuming similar hardware and ignoring possible differences
caused by the different operating systems.

The access paths you described are what would be expected. Applying a
function (UPPER) to a column will require a table scan to evaluate. When
you have the index, the index is used first because it will cause less
I/O to process. The following scan was probably caused because the
optimizer determined that the results of the index processing would
retrieve enough of the table that scan would be faster than retrieving
individual pages of the table.

One way to optimize this is:
1. Add a column to the table, populating it with the first character of
each of the columns A and B, uppercased. This will require insert and
update triggers. (Update only if the column content can be modified.)
2. Create an index on the new upper cased column.
3. Cluster the table on the newly created index. You''ll have to reorg
the table to make the clustering efffective.
4. Change the query to use the new column name and combine the two data
values to a pair of letters in the query.
The query will now use the new index to locate a range of rows that
match. Sequential detect will switch access mode to prefetch when that
results in less I/O. This technique will retrieve your data rows using
an index and the minimum possible I/O.

This does not take into account other possible issues caused by using
the new column as a clustering index. The table rows will grow by two
bytes/row (assuming NOT NULL) which will add 280,000 bytes table. You''ll
also need space for the index. This is probably a fair tradeoff if the
query is frequently used. Explains run before and after the clustering
reorg can be used to determine if the clustering option is necessary to
avoid the table scan.

Soundex code are often used to provide some selectivity on names when
searching. You might consider using it instead of a single character for
selectivity.

Phil Sherman


Rudolf Bargholz wrote:


我有一个相对简单的SQL:

从TABLE中选择FK,其中up​​per(A)like'' B%''和upper(C)喜欢''D%''
我们在Linux和Windows 2003上安装了DB2 UDB v7.1 FP 12

在Linux上使用优化级别5以及9和0 SQL使用3''100''000'000个计时器!
在Windows 2003上,SQL需要26''000个计时器

表格表有140''000行。 SQL的结果集返回200行。

任何想法可能导致这种差异。访问计划是相同的(表 - > TBSACAN - > RESULT)但是一个是可以接受的,另一个不是。

Runstats已在两个表上执行。
<我们看到的另一个问题是,当我们在A列上添加索引时,
访问计划变得完全不同。 DB2将在表上执行索引扫描,然后执行FETCH并将索引扫描的结果与表(!)连接,然后执行TBSCAN,然后返回RESULT。在这种情况下,FETCH需要3''000''000'000个定时器。

如上所述,我们使用优化级别0,5和9进行了测试。
比较Windows和Linux上的优化SQL表明,无论是在Windows上还是在Linux上都不需要进行优化。

任何想法?

关心

Rudolf Bargholz
Hi,

I have a ralatively simple SQL:

select FK from TABLE where upper(A) like ''B%'' and upper(C) like ''D%''

We have DB2 UDB v7.1 FP 12 installed on Linux and on Windows 2003

On Linux using optimization level 5 as well as 9 and 0 the SQL uses
3''100''000''000 timerons !
On Windows 2003 the SQL needs 26''000 timerons

We have tested with SUSE Linux 9.1 as well as Redhat 7.2 , both with the
same dramtic results. And no, the number of zeros is correct.

The Table TABLE has 140''000 rows. The result set of the SQL returns 200
rows.

Any idea what could be causing this difference. The access plans are
identical (TABLE -> TBSACAN -> RESULT) but one is acceptable, the other not.

Runstats have been executed on both tables.

Another problem we have seen is, when we add an index on column A, the
access plan becomes totally different. DB2 will perform an index scan on
TABLE, then performs a FETCH and joins the result of the index scan with
TABLE (!), then does a TBSCAN and then returns the RESULT. The FETCH in this
case takes 3''000''000''000 timerons.

As I mentioned above, we tested using optimization levels 0, 5 and 9.
Comparing the optimized SQL on Windows and Linux showed, that no
optimization was necessary, neither on Windows, nor on Linux.

Any ideas?

Regards

Rudolf Bargholz






嗨Phil,


谢谢回复。评论如下。


问候


Rudolf


" Philip Sherman" < PS ****** @ ameritech.net> schrieb im Newsbeitrag

新闻:QQ **************** @ newssvr28.news.prodigy.com ...
Hi Phil,

Thanks for the reply. Comments below.

Regards

Rudolf

"Philip Sherman" <ps******@ameritech.net> schrieb im Newsbeitrag
news:QQ****************@newssvr28.news.prodigy.com ...
Timerons是一种任意的度量单位。我不相信它们可以跨平台进行比较。


如果排队在Windows上花费不到一秒钟,但在Linux上花费30秒

到一分钟......


我之所以要问的原因是因为我们的一个客户在Linux上使用了DB2,因此Linux出现了极大的速度问题。我无法理解为什么他们之前没有就这些问题与我们联系过
。如果我不得不以他们习以为常的速度工作,我个人会去疯狂。


关于硬件,两台测试机非常相似,没什么

会向我解释访问计划和计时器的巨大差异。

您描述的访问路径是预期的。将
函数(UPPER)应用于列将需要进行表扫描以进行评估。当你有索引时,首先使用索引,因为它会导致更少的I / O处理。


为什么Windows不使用索引,但Linux会这样做?当桌面扫描不可避免时,我认为没有理由使用

索引扫描。

以下扫描可能是因为
优化器确定结果索引处理将检索足够的表,扫描比检索表的各个页面更快。


如果我将优化级别设置为
0或9,优化器难道不会有不同的反应吗?我尝试了0,5和9,并且访问计划没有改变。

优化这一点的一种方法是:
1.在表中添加一列,用第一个字符填充它
每列A和B,大写。这将需要插入和
更新触发器。 (仅在可以修改列内容时更新。)
2.在新的大写列上创建索引。
3.在新创建的索引上对表进行聚类。你将不得不重新组合表格以使聚类有效。
4.更改查询以使用新的列名称并将两个数据值组合成一对字母查询。


这是一个很酷的主意:-)谢谢!

查询现在将使用新索引来定位
匹配的行范围。顺序检测会将访问模式切换为预取,这样可以减少I / O.此技术将使用索引和最小可能的I / O来检索数据行。

这不考虑使用新列作为其他可能导致的问题。聚类索引。表行将增长两个
字节/行(假设为NOT NULL),这将增加280,000字节的表。你还需要索引的空间。如果经常使用
查询,这可能是一个公平的权衡。解释在聚类之前和之后运行
reorg可用于确定是否需要聚类选项以避免表扫描。

Soundex代码通常用于提供一些选择性在搜索时的名字。您可以考虑使用它而不是单个字符来选择性。

Phil Sherman


Rudolf Bargholz写道:
Timerons are an arbitrary unit of measure. I don''t believe that they are
comparable across platforms.
If queies the take a fraction of a second on Windows but take thirty seconds
to one minute on Linux ....

You didn''t mention if the queries had been executed on the platforms and
the resultant execution times compared. I suspect that they''d be
comparable, assuming similar hardware and ignoring possible differences
caused by the different operating systems.
The reason I am asking is because the one customer of ours that uses DB2 on
Linux is having dramatic speed problems. I cannot understand why they have
not contacted us earlier regarding these problems. I personally would go
crazy if I had to work at the speed they are used to.

Regarding the hardware, both test machines are very similar, nothing that
would explain to me the dramatic difference in access plan and timerons.

The access paths you described are what would be expected. Applying a
function (UPPER) to a column will require a table scan to evaluate. When
you have the index, the index is used first because it will cause less
I/O to process.
Why would Windows not use an index, but Linux does? I see no reason why an
index scan is used when a table scan is unavoidable.
The following scan was probably caused because the
optimizer determined that the results of the index processing would
retrieve enough of the table that scan would be faster than retrieving
individual pages of the table.
Shouldn''t the optimizer react differently if I set the optimization level to
0 or 9? I tried 0, 5 and 9 and the access plan did not change.

One way to optimize this is:
1. Add a column to the table, populating it with the first character of
each of the columns A and B, uppercased. This will require insert and
update triggers. (Update only if the column content can be modified.)
2. Create an index on the new upper cased column.
3. Cluster the table on the newly created index. You''ll have to reorg
the table to make the clustering efffective.
4. Change the query to use the new column name and combine the two data
values to a pair of letters in the query.
This is a cool idea :-) Thanks !

The query will now use the new index to locate a range of rows that
match. Sequential detect will switch access mode to prefetch when that
results in less I/O. This technique will retrieve your data rows using
an index and the minimum possible I/O.

This does not take into account other possible issues caused by using
the new column as a clustering index. The table rows will grow by two
bytes/row (assuming NOT NULL) which will add 280,000 bytes table. You''ll
also need space for the index. This is probably a fair tradeoff if the
query is frequently used. Explains run before and after the clustering
reorg can be used to determine if the clustering option is necessary to
avoid the table scan.

Soundex code are often used to provide some selectivity on names when
searching. You might consider using it instead of a single character for
selectivity.

Phil Sherman


Rudolf Bargholz wrote:


我有一个相对简单的SQL:

从TABLE中选择FK,其中up​​per(A)喜欢''B%''和upper(C)喜欢'' D%''

我们在Linux和Windows 2003上安装了DB2 UDB v7.1 FP 12

在Linux上使用优化级别5以及9和0 SQL使用
3''100''000'000计时器!
在Windows 2003上,SQL需要26''000个计时器

我们已经使用SUSE Linux 9.1进行了测试以及Redhat 7.2,两者都具有相同的dramtic结果。不,零的数量是正确的。

表格表有140''000行。 SQL的结果集返回200行。

任何想法可能导致这种差异。访问计划是相同的(TABLE - > TBSACAN - > RESULT)但一个是可以接受的,另一个是
没有。
Runstats已在两个表上执行。
<我们看到的另一个问题是,当我们在A列上添加索引时,
访问计划变得完全不同。 DB2将在表上执行索引扫描,然后执行FETCH并将索引扫描的结果与表(!)连接,然后执行TBSCAN,然后返回RESULT。这个案例中的
中的FETCH需要3''000''000'000个定时器。

如上所述,我们使用优化级别0,5和9进行了测试。
比较Windows和Linux上的优化SQL表明,无论是在Windows上还是在Linux上都不需要进行优化。

任何想法?

关心

Rudolf Bargholz
Hi,

I have a ralatively simple SQL:

select FK from TABLE where upper(A) like ''B%'' and upper(C) like ''D%''

We have DB2 UDB v7.1 FP 12 installed on Linux and on Windows 2003

On Linux using optimization level 5 as well as 9 and 0 the SQL uses
3''100''000''000 timerons !
On Windows 2003 the SQL needs 26''000 timerons

We have tested with SUSE Linux 9.1 as well as Redhat 7.2 , both with the
same dramtic results. And no, the number of zeros is correct.

The Table TABLE has 140''000 rows. The result set of the SQL returns 200
rows.

Any idea what could be causing this difference. The access plans are
identical (TABLE -> TBSACAN -> RESULT) but one is acceptable, the other not.
Runstats have been executed on both tables.

Another problem we have seen is, when we add an index on column A, the
access plan becomes totally different. DB2 will perform an index scan on
TABLE, then performs a FETCH and joins the result of the index scan with
TABLE (!), then does a TBSCAN and then returns the RESULT. The FETCH in this case takes 3''000''000''000 timerons.

As I mentioned above, we tested using optimization levels 0, 5 and 9.
Comparing the optimized SQL on Windows and Linux showed, that no
optimization was necessary, neither on Windows, nor on Linux.

Any ideas?

Regards

Rudolf Bargholz



Philip Sherman写道:
Philip Sherman wrote:
优化这种方法的一种方法是:
1.在表格中添加一个列,并使用大写的每个A列和B列的第一个字符填充它。这将需要插入和
更新触发器。 (仅在可以修改列内容时更新。)
2.在新的大写列上创建索引。
3.在新创建的索引上对表进行聚类。你将不得不重新组合表格以使聚类有效。
4.更改查询以使用新的列名称并将两个数据值组合成一对字母查询。
One way to optimize this is:
1. Add a column to the table, populating it with the first character of
each of the columns A and B, uppercased. This will require insert and
update triggers. (Update only if the column content can be modified.)
2. Create an index on the new upper cased column.
3. Cluster the table on the newly created index. You''ll have to reorg
the table to make the clustering efffective.
4. Change the query to use the new column name and combine the two data
values to a pair of letters in the query.




生成的列将是一种进一步简化这一点的方法,因为它们可以让你自动执行步骤1的部分(否)那里需要触发器)和第4步

(如果可能的话,db2会自动重新路由)。


-

Knut Stolze

信息集成

IBM德国/耶拿大学



generated columns would be a way to even further simplify this because they
allow you to automate part of step 1 (no triggers needed there) and step 4
(db2 will reroute automatically for you if possible).

--
Knut Stolze
Information Integration
IBM Germany / University of Jena


这篇关于Linux上的性能问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆