内部联合/指数阈值? [英] INNER JOIN/Index Threshold?

查看:55
本文介绍了内部联合/指数阈值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

伙计们,我真的被这个问题困住了。任何帮助或建议将是

赞赏。

我们有一个大桌子,似乎只是达到某种阈值。

他们查询是

表中没有索引时有些响应。但是,当我们索引电子邮件时,查询需要永远。


事实

- 问题非常数据特定。我无法使用不同的数据重新创建

问题。

- 当我在基表上索引电子邮件时,只有一个问题。

- 当我添加AND b.email IS NOT NULL时,问题就消失了。到

内连接条件。当我将逻辑添加到

WHERE时,它没有用。子句。

DDL

CREATE TABLE base(bk char(25),email varchar(100))

在基础上创建聚集索引icx( bk)

在基础上创建索引ix_email(电子邮件)

CREATE TABLE过滤器(bk char(25),email varchar(100))

在过滤器上创建聚集索引icx(bk)

在过滤器上创建索引ix_email(电子邮件)


查询

SELECT b.bk ,b.email

FROM base b WITH(NOLOCK)

INNER JOIN过滤器f ON f.email = b.email

- 和f.email不为空


数据资料

--35120500,35120491,14221553

SELECT COUNT(*),COUNT (DISTINCT BK),COUNT(DISTINCT电子邮件)

从基站

--16796199,16796192,14221553

SELECT COUNT(*),COUNT (DISTINCT bk),COUNT(DISTINCT email)

FROM base

WHERE email is not NULL


- 250552,250552 ,250205

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT email)

FROM fil之三

--250208,250208,250205

SELECT COUNT(*),COUNT(DISTINCT BK),COUNT(DISTINCT电子邮件)

起价过滤器

WHERE电子邮件不是空的

Guys I am really stuck on this one. Any help or suggestions would be
appreciated.
We have a large table which seemed to just hit some kind of threshold.
They query is somewhat responsive when there are NO indexes on the
table. However, when we index email the query takes forever.

FACTS
- The problem is very "data specific". I can not recreate the
problem using different data.
- There is only a problem when I index email on the base table.
- The problem goes away when I add "AND b.email IS NOT NULL" to the
inner join condition. It does not help when I add the logic to the
"WHERE" clause.
DDL
CREATE TABLE base (bk char(25), email varchar(100))
create clustered index icx on base(bk)
create index ix_email on base(email)
CREATE TABLE filter (bk char(25), email varchar(100))
create clustered index icx on filter (bk)
create index ix_email on filter (email)

Query
SELECT b.bk, b.email
FROM base b WITH(NOLOCK)
INNER JOIN filter f ON f.email = b.email
--and f.email is not null

Data Profile
--35120500, 35120491, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
--16796199, 16796192, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
WHERE email IS NOT NULL

--250552, 250552, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
--250208, 250208, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
WHERE email IS NOT NULL

推荐答案

Dave写道:
Dave wrote:

伙计们,我真的被这个问题困住了。任何帮助或建议将是

赞赏。


我们有一张大桌子似乎只是达到某种门槛。



表中没有索引时,它们的查询有些响应。但是,当我们索引电子邮件时,查询需要永远。


事实

- 问题非常数据特定。我无法使用不同的数据重新创建

问题。

- 当我在基表上索引电子邮件时,只有一个问题。

- 当我添加AND b.email IS NOT NULL时,问题就消失了。到

内连接条件。当我将逻辑添加到

WHERE时,它没有用。子句。


DDL

CREATE TABLE base(bk char(25),email varchar(100))

create clustered index icx on base(bk)

在基础上创建索引ix_email(电子邮件)


CREATE TABLE过滤器(bk char(25),email varchar(100))

在过滤器上创建聚集索引icx(bk)

在过滤器上创建索引ix_email(电子邮件)


查询

SELECT b.bk,b.email

FROM base b WITH(NOLOCK)

INNER JOIN过滤器f ON f.email = b.email

- 和f.email不为空



数据资料

--35120500,35120491,14221553
SELECT COUNT(*),COUNT(DISTINCT BK),COUNT(DISTINCT电子邮件)

从基站


--16796199,16796192, 14221553

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT email)

FROM FROM

WHERE email IS NOT NULL


- 250552,250552,250205

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTIN) CT电子邮件)

FROM过滤器


- 250208,250208,250205

SELECT COUNT(*),COUNT(DISTINCT bk ),COUNT(DISTINCT电子邮件)

FROM过滤器

WHERE电子邮件不是空的
Guys I am really stuck on this one. Any help or suggestions would be
appreciated.
We have a large table which seemed to just hit some kind of threshold.
They query is somewhat responsive when there are NO indexes on the
table. However, when we index email the query takes forever.

FACTS
- The problem is very "data specific". I can not recreate the
problem using different data.
- There is only a problem when I index email on the base table.
- The problem goes away when I add "AND b.email IS NOT NULL" to the
inner join condition. It does not help when I add the logic to the
"WHERE" clause.
DDL
CREATE TABLE base (bk char(25), email varchar(100))
create clustered index icx on base(bk)
create index ix_email on base(email)
CREATE TABLE filter (bk char(25), email varchar(100))
create clustered index icx on filter (bk)
create index ix_email on filter (email)

Query
SELECT b.bk, b.email
FROM base b WITH(NOLOCK)
INNER JOIN filter f ON f.email = b.email
--and f.email is not null

Data Profile
--35120500, 35120491, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
--16796199, 16796192, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
WHERE email IS NOT NULL

--250552, 250552, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
--250208, 250208, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
WHERE email IS NOT NULL



估计是什么执行计划看起来像?尝试添加列

BK作为电子邮件索引中的第二个键。

-

Tracy McKibben

MCDBA
http://www.realsqlguy.com


bk是聚类的,所以没有理由把它添加到另一个索引。


估计的计划在每个表上进行索引扫描,然后合并加入。


当我删除base.email上的索引时,计划会在

base上进行表扫描然后执行哈希匹配

Tracy McKibben写道:
bk is clustered so there is no reason to add it to the other index.

Estimated plan does an index scan on each table, then a merge join.

When I remove the index on base.email the plan does a table scan on
base then performs a Hash Match
Tracy McKibben wrote:

Dave写道:
Dave wrote:

伙计我真的被困在这一个。任何帮助或建议将是

赞赏。

我们有一个大桌子,似乎只是达到某种阈值。

他们查询是

表中没有索引时有些响应。但是,当我们索引电子邮件时,查询需要永远。


事实

- 问题非常数据特定。我无法使用不同的数据重新创建

问题。

- 当我在基表上索引电子邮件时,只有一个问题。

- 当我添加AND b.email IS NOT NULL时,问题就消失了。到

内连接条件。当我将逻辑添加到

WHERE时,它没有用。子句。

DDL

CREATE TABLE base(bk char(25),email varchar(100))

在基础上创建聚集索引icx( bk)

在基础上创建索引ix_email(电子邮件)

CREATE TABLE过滤器(bk char(25),email varchar(100))

在过滤器上创建聚集索引icx(bk)

在过滤器上创建索引ix_email(电子邮件)


查询

SELECT b.bk ,b.email

FROM base b WITH(NOLOCK)

INNER JOIN过滤器f ON f.email = b.email

- 和f.email不为空


数据资料

--35120500,35120491,14221553

SELECT COUNT(*),COUNT (DISTINCT BK),COUNT(DISTINCT电子邮件)

从基站

--16796199,16796192,14221553

SELECT COUNT(*),COUNT (DISTINCT bk),COUNT(DISTINCT email)

FROM base

WHERE email is not NULL


- 250552,250552 ,250205

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT)电子邮件)

FROM过滤器

- 250208,250208,250205

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT电子邮件)

FROM过滤器

WHERE电子邮件IS NOT NULL
Guys I am really stuck on this one. Any help or suggestions would be
appreciated.
We have a large table which seemed to just hit some kind of threshold.
They query is somewhat responsive when there are NO indexes on the
table. However, when we index email the query takes forever.

FACTS
- The problem is very "data specific". I can not recreate the
problem using different data.
- There is only a problem when I index email on the base table.
- The problem goes away when I add "AND b.email IS NOT NULL" to the
inner join condition. It does not help when I add the logic to the
"WHERE" clause.
DDL
CREATE TABLE base (bk char(25), email varchar(100))
create clustered index icx on base(bk)
create index ix_email on base(email)
CREATE TABLE filter (bk char(25), email varchar(100))
create clustered index icx on filter (bk)
create index ix_email on filter (email)

Query
SELECT b.bk, b.email
FROM base b WITH(NOLOCK)
INNER JOIN filter f ON f.email = b.email
--and f.email is not null

Data Profile
--35120500, 35120491, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
--16796199, 16796192, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
WHERE email IS NOT NULL

--250552, 250552, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
--250208, 250208, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
WHERE email IS NOT NULL



估计执行计划是什么样的?尝试添加列

BK作为电子邮件索引中的第二个键。


-

Tracy McKibben

MCDBA
http://www.realsqlguy.com


Dave,


你们两张桌子上都有很多空电子邮件吗?如果是这样,那可能是影响因素。


- 比尔


Dave < da ****** @ gmail.comwrote in message

news:11 ********************** @ s80g2000cwa .googlegr oups.com ...
Dave,

Do you have a lot of null emails on both tables? If so, that might be an
influencing factor.

-- Bill

"Dave" <da******@gmail.comwrote in message
news:11**********************@s80g2000cwa.googlegr oups.com...

伙计我真的被困在这一个上了。任何帮助或建议将是

赞赏。


我们有一张大桌子似乎只是达到某种门槛。



表中没有索引时,它们的查询有些响应。但是,当我们索引电子邮件时,查询需要永远。


事实

- 问题非常数据特定。我无法使用不同的数据重新创建

问题。

- 当我在基表上索引电子邮件时,只有一个问题。

- 当我添加AND b.email IS NOT NULL时,问题就消失了。到

内连接条件。当我将逻辑添加到

WHERE时,它没有用。子句。


DDL

CREATE TABLE base(bk char(25),email varchar(100))

create clustered index icx on base(bk)

在基础上创建索引ix_email(电子邮件)


CREATE TABLE过滤器(bk char(25),email varchar(100))

在过滤器上创建聚集索引icx(bk)

在过滤器上创建索引ix_email(电子邮件)


查询

SELECT b.bk,b.email

FROM base b WITH(NOLOCK)

INNER JOIN过滤器f ON f.email = b.email

- 和f.email不为空



数据资料

--35120500,35120491,14221553
SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT email)

FROM base


--16796199,1679192, 14221553

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT email)

FROM FROM

WHERE email IS NOT NULL


- 250552,250552,250205

SELECT COUNT(*),COUNT(DISTINCT bk),COUNT(DISTINCT em ail)

来自过滤器


- 250208,250208,250205

SELECT COUNT(*),COUNT(DISTINCT bk) ,COUNT(DISTINCT电子邮件)

FROM过滤器

WHERE电子邮件不是空白
Guys I am really stuck on this one. Any help or suggestions would be
appreciated.
We have a large table which seemed to just hit some kind of threshold.
They query is somewhat responsive when there are NO indexes on the
table. However, when we index email the query takes forever.

FACTS
- The problem is very "data specific". I can not recreate the
problem using different data.
- There is only a problem when I index email on the base table.
- The problem goes away when I add "AND b.email IS NOT NULL" to the
inner join condition. It does not help when I add the logic to the
"WHERE" clause.
DDL
CREATE TABLE base (bk char(25), email varchar(100))
create clustered index icx on base(bk)
create index ix_email on base(email)
CREATE TABLE filter (bk char(25), email varchar(100))
create clustered index icx on filter (bk)
create index ix_email on filter (email)

Query
SELECT b.bk, b.email
FROM base b WITH(NOLOCK)
INNER JOIN filter f ON f.email = b.email
--and f.email is not null

Data Profile
--35120500, 35120491, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
--16796199, 16796192, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
WHERE email IS NOT NULL

--250552, 250552, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
--250208, 250208, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
WHERE email IS NOT NULL



这篇关于内部联合/指数阈值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆