Recursive Query


Problem description

Hi there,

Need a little help with a certain query that's causing a lot of acid
in my stomach...

I have a table that stores sales measures for a given client. The sales
measures are stored per year, and there can be multiple sales
measures per year per client. There is another field called last
update date. If there are multiple sales measures, I need to select
the one that was entered last based on this field. Also, if there
is no sales measure data for the current year, I need to return the
data for the last year for which a sales measure has been entered. For
example: if client #1 has a sales measure value of $200 for 1999 and
nothing since, then I need to return $200 for any year following 1999.

So the query would look something like this:

SELECT client_name, sm_dollars FROM <tables>

Based on the DDL at the bottom I would expect to get back: c1, 100;
c2, 200

The way I am doing it now is with correlated subqueries (3 to be
exact) that each do an aggregate and join back to the original table.
It works, but it is notoriously slow. SQL Server is scanning the
index and does a merge join, which in a large query takes 95% of the
time. Here is the relevant part of the query plan:

| | | | | | |--Merge Join(Inner Join, MANY-TO-MANY MERGE:([sales_measure].[client_id])=([sales_measure].[client_id]), RESIDUAL:(([sales_measure].[client_id]=[sales_measure].[client_id] AND [sales_measure].[tax_year]=[sales_measure].[tax_year]) AND [Expr1013]=[sales_measure].[last_update_date]))
| | | | | | |--Stream Aggregate(GROUP BY:([sales_measure].[client_id], [sales_measure].[tax_year]) DEFINE:([Expr1013]=MAX([sales_measure].[last_update_date])))
| | | | | | | |--Merge Join(Inner Join, MERGE:([sales_measure].[client_id], [Expr1010])=([sales_measure].[client_id], [sales_measure].[tax_year]), RESIDUAL:([sales_measure].[client_id]=[sales_measure].[client_id] AND [sales_measure].[tax_year]=[Expr1010]))
| | | | | | | |--Stream Aggregate(GROUP BY:([sales_measure].[client_id]) DEFINE:([Expr1010]=MAX([sales_measure].[tax_year])))
| | | | | | | | |--Index Scan(OBJECT:([stars_perftest].[dbo].[sales_measure].[sales_measure_idx1]), ORDERED FORWARD)
| | | | | | | |--Index Scan(OBJECT:([stars_perftest].[dbo].[sales_measure].[sales_measure_idx1]), ORDERED FORWARD)
| | | | | | |--Index Scan(OBJECT:([stars_perftest].[dbo].[sales_measure].[sales_measure_idx1]), ORDERED FORWARD)

There are two indexes on the sales_measure table:

sales_measure_pk - sales_measure_id (primary key) clustered
sales_measure_idx1 - client_id, tax_year, last_update_date, sm_dollars

sales_measure table has 800,000 rows in it.

Here is the rest of the DDL:

IF OBJECT_ID('dbo.client') IS NOT NULL
    DROP TABLE dbo.client
GO
CREATE TABLE dbo.client (
    client_id int identity primary key
  , client_name varchar(100) NOT NULL)
GO
IF OBJECT_ID('dbo.sales_measure') IS NOT NULL
    DROP TABLE dbo.sales_measure
GO
CREATE TABLE dbo.sales_measure (
    sales_measure_id int identity primary key
  , client_id int NOT NULL
  , tax_year smallint NOT NULL
  , sm_dollars money NOT NULL
  , last_update_date datetime NOT NULL)
GO
CREATE INDEX sales_measure_idx1 ON sales_measure (client_id, tax_year,
    last_update_date, sm_dollars)
GO
INSERT dbo.client (client_name)
SELECT 'c1' UNION SELECT 'c2' UNION SELECT 'c3'
GO
INSERT dbo.sales_measure (client_id, tax_year, sm_dollars,
    last_update_date)
SELECT 1, 2004, 100, '1/4/2004'
UNION
SELECT 2, 2003, 100, '1/3/2004'
UNION
SELECT 2, 2004, 150, '1/4/2004'
UNION
SELECT 2, 2004, 200, '1/5/2004'

The view that I use to calculate sales measures:

CREATE VIEW sales_measure_vw AS
SELECT sm.*
FROM sales_measure sm
INNER JOIN (SELECT sm2.client_id, sm2.tax_year,
                   MAX(sm2.last_update_date) AS last_update_date
            FROM sales_measure sm2
            INNER JOIN (SELECT sm4.client_id, MAX(sm4.tax_year) AS tax_year
                        FROM sales_measure sm4
                        GROUP BY sm4.client_id) sm3
                ON sm3.client_id = sm2.client_id
               AND sm3.tax_year = sm2.tax_year
            GROUP BY sm2.client_id, sm2.tax_year) sm1
    ON sm.client_id = sm1.client_id
   AND sm.tax_year = sm1.tax_year
   AND sm.last_update_date = sm1.last_update_date
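
For comparison, the requirement described above (latest tax year per client, then the latest update within that year) could also be written with a window function. This is only a sketch and not part of the original post: it assumes a SQL Server version that supports ROW_NUMBER() (2005 or later), and the alias names x and rn are made up for illustration.

```sql
-- Sketch: rank each client's rows by tax_year, then last_update_date,
-- newest first, and keep only the top-ranked row per client.
SELECT c.client_name, x.sm_dollars
FROM dbo.client AS c
INNER JOIN (
    SELECT sm.client_id,
           sm.sm_dollars,
           ROW_NUMBER() OVER (PARTITION BY sm.client_id
                              ORDER BY sm.tax_year DESC,
                                       sm.last_update_date DESC) AS rn
    FROM dbo.sales_measure AS sm
) AS x
    ON x.client_id = c.client_id
WHERE x.rn = 1;
```

Against the sample data in the DDL this should give c1, 100 and c2, 200; c3 has no sales measures and is dropped by the inner join. Whether it is actually faster than the view depends on the plan SQL Server picks, so it is worth comparing the two plans on the real 800,000-row table.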

Any advice on how to tame this would be appreciated. Also, any advice
on the indexes would help as well.

Thanks

Bob

Recommended answer

Do you really have clients with 100 character names? Well, if you
don't take the time to design the columns in the tables properly, you
will. Why do you think that an IDENTITY can ever be used as a key? The
tax year is computed from the creation date, so you have a redundancy.
Isn't the real key for the sales measures (client_id,
creation_date)? You are missing DRI. Unless you have only one client
and one sales measurement, you ought to use plural or collective nouns
for the table names; tables are sets, not scalars.

You used the proprietary MONEY data type instead of a valid SQL
datatype. The ISO-8601 standard date format "yyyy-mm-dd" is the only
one allowed in Standard SQL. So cleaning up the DDL, we get something
like:

CREATE TABLE Clients
(client_id INTEGER NOT NULL PRIMARY KEY,
client_name VARCHAR(35) NOT NULL);

CREATE TABLE SalesMeasures
(client_id INTEGER NOT NULL
REFERENCES Clients (client_id),
creation_date DATETIME NOT NULL,
sm_dollars DECIMAL(12,4) NOT NULL,
PRIMARY KEY (client_id, creation_date));
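
With tables shaped like the ones above, "the latest measure for each client" reduces to one correlated subquery on creation_date. The query below is a sketch added for illustration, not something taken from the answer itself; the aliases C, S and S2 are arbitrary.

```sql
-- Sketch: for each client, return the row whose creation_date is the
-- most recent one recorded for that client.
SELECT C.client_name, S.sm_dollars
FROM Clients AS C
INNER JOIN SalesMeasures AS S
    ON S.client_id = C.client_id
WHERE S.creation_date =
      (SELECT MAX(S2.creation_date)
       FROM SalesMeasures AS S2
       WHERE S2.client_id = S.client_id);
```

Because (client_id, creation_date) is the primary key, at most one row per client can match, and clients with no measures at all simply do not appear in the result.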
>> Have a table that stores sales measures for a given client. The sales
measures are stored per year and there could be multiple sales
measures every year per client. There is another field [sic] called
last update date. If there are multiple sales measures then need to
select the one that's been entered last based on this field [sic]. <<

Fields and columns are totally different concepts. What you have is a
history of the Sales measures. You keep adding to it, not updating
it.
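
As a small illustration of "adding to it, not updating it", a correction to a client's measure would go in as a new row with a later creation_date. The rows below are hypothetical and assume the Clients and SalesMeasures tables from the cleaned-up DDL, with client 2 already present in Clients.

```sql
-- Hypothetical data for illustration only.
-- Note: in SQL Server a 'yyyy-mm-dd' string assigned to DATETIME is read
-- according to the session's language/DATEFORMAT settings; the default
-- us_english settings are assumed here.
INSERT INTO SalesMeasures (client_id, creation_date, sm_dollars)
VALUES (2, '2004-01-04', 150.0000);

-- A later revision is appended as another row; the earlier row stays
-- in place as history instead of being overwritten with an UPDATE.
INSERT INTO SalesMeasures (client_id, creation_date, sm_dollars)
VALUES (2, '2004-01-05', 200.0000);
```

The current value for a client is then whichever row has the greatest creation_date, and the older rows remain available as the audit trail.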
>> Also, if there is no sales measure data for current year then I
need to return the last year's data for which sales measure has been
entered. For example: if client #1 has sales measure value of
