Parsing a comma-separated list


Problem description


Once upon a time there was a table:

CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50), STATE CHAR(2))@

After a while the developers realized that a vendor may be present in
several states, so the structure was changed to

CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50), STATE_LIST
VARCHAR(150))@

STATE_LIST could contain a list of comma-separated values like this:
CA,WA,OR
Not the best approach, but out here in the field things like this
happen all the time. Everybody was absolutely happy until somebody
required queries like these:
SELECT * FROM VENDOR WHERE STATE IN ('IL', 'WI', 'MI')
SELECT COUNT(*), STATE FROM VENDOR GROUP BY STATE
and so on.
Given a database structure developed by a skilled professional, that
would be a piece of cake, but not this time.

Even after moving to a better structure

CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50))@
CREATE TABLE VENDOR_IN_STATE(VENDOR_ID INT, STATE CHAR(2))@

nobody wanted to change the front-end application.
As a workaround, I created a view and INSTEAD OF triggers that work
like this:

INSERT INTO VENDOR_IN_STATE_VIEW VALUES(1, 'ABC INC.', 'AK,AL,IL')@

SELECT * FROM VENDOR
VENDOR_ID NAME
----------- --------------------------------------------------
1 ABC INC.
1 record(s) selected.

SELECT * FROM VENDOR_IN_STATE
VENDOR_ID STATE
----------- -----
1 AK
1 AL
1 IL
3 record(s) selected.
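
With the states stored one per row as shown above, the two queries mentioned at the start become ordinary SQL. The following is a minimal sketch using Python's sqlite3 module rather than DB2, so the dialect differs slightly; the second vendor ('XYZ LLC') and its states are made-up sample data added only to make the results more interesting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50));
    CREATE TABLE VENDOR_IN_STATE(VENDOR_ID INT, STATE CHAR(2));
    -- Vendor 2 is hypothetical sample data, not from the original post.
    INSERT INTO VENDOR VALUES (1, 'ABC INC.'), (2, 'XYZ LLC');
    INSERT INTO VENDOR_IN_STATE VALUES
        (1, 'AK'), (1, 'AL'), (1, 'IL'), (2, 'IL'), (2, 'WI');
""")

# Vendors present in at least one of the requested states.
vendors_in_states = conn.execute("""
    SELECT DISTINCT V.VENDOR_ID, V.NAME
    FROM VENDOR V
    JOIN VENDOR_IN_STATE S ON S.VENDOR_ID = V.VENDOR_ID
    WHERE S.STATE IN ('IL', 'WI', 'MI')
    ORDER BY V.VENDOR_ID
""").fetchall()

# Number of vendors per state.
vendors_per_state = conn.execute("""
    SELECT STATE, COUNT(*)
    FROM VENDOR_IN_STATE
    GROUP BY STATE
    ORDER BY STATE
""").fetchall()

print(vendors_in_states)
print(vendors_per_state)
```

The DISTINCT matters because a vendor present in two of the requested states would otherwise be listed twice.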

I used recursion to define both the view and the triggers:

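CREATE FUNCTION PARSE_LIST(C_LIST VARCHAR(100))
RETURNS TABLE(TOKEN VARCHAR(100))
SPECIFIC PARSE_LIST
RETURN
WITH PARSED_LIST(STEP, TOKEN, REMAINDER, LEFTMOST_COMMA)
AS(
-- Seed row: nothing parsed yet. The 3 assumes the first token is
-- exactly 2 characters long, so the first comma is at position 3.
-- The CAST gives the seed TOKEN the same type as the recursive branch.
VALUES(0, CAST('' AS VARCHAR(100)), C_LIST, 3)
UNION ALL
SELECT STEP+1 AS STEP,
-- TOKEN: everything before the leftmost comma, or the whole rest
CASE WHEN LEFTMOST_COMMA>0 THEN
CAST(LEFT(REMAINDER,LEFTMOST_COMMA-1) AS CHAR(2))
ELSE
REMAINDER
END AS TOKEN,
-- REMAINDER: everything after the leftmost comma, or NULL when done
CASE WHEN LEFTMOST_COMMA>0 THEN
CAST(SUBSTR(REMAINDER,LEFTMOST_COMMA+1) AS VARCHAR(100))
ELSE
NULL
END AS REMAINDER,
-- Position of the first comma within the next REMAINDER (0 if none)
LOCATE(',',SUBSTR(REMAINDER,LOCATE(',',REMAINDER)+1)) AS
LEFTMOST_COMMA
FROM PARSED_LIST
WHERE REMAINDER IS NOT NULL
)
SELECT TOKEN FROM PARSED_LIST WHERE STEP>0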

I wrapped the recursive query in a UDF so that it can be reused.
It is very easy to invoke this table UDF:

SELECT * FROM TABLE(PARSE_LIST('AK,AR,IL,OH')) AS PARSE_LIST
TOKEN
------
AK
AR
IL
OH
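
For readers without a DB2 instance at hand, the same peel-off-one-token-per-step recursion can be tried in SQLite. This is only a sketch under stated assumptions: it uses SQLite's instr and substr instead of LOCATE and LEFT, appends a trailing comma so that every step finds a separator, and the Python wrapper parse_list is a hypothetical helper, not the UDF from the post. Unlike the original, it does not assume 2-character tokens.

```python
import sqlite3

# Recursive CTE: each step moves the text before the first comma into
# "token" and keeps the text after it as "remainder".
SPLIT_SQL = """
WITH RECURSIVE parsed(step, token, remainder) AS (
    SELECT 0, '', :list || ','
    UNION ALL
    SELECT step + 1,
           substr(remainder, 1, instr(remainder, ',') - 1),
           substr(remainder, instr(remainder, ',') + 1)
    FROM parsed
    WHERE remainder <> ''
)
SELECT trim(token)
FROM parsed
WHERE step > 0 AND trim(token) <> ''
ORDER BY step
"""


def parse_list(conn, c_list):
    """Return the tokens of a comma-separated string, in order."""
    rows = conn.execute(SPLIT_SQL, {"list": c_list}).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
tokens = parse_list(conn, "AK,AR,IL,OH")
print(tokens)
```

Appending the comma up front removes the need for a separate LEFTMOST_COMMA column, at the cost of one extra concatenation per call.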

I use the UDF in an INSTEAD OF trigger:

CREATE TRIGGER VENDOR_IN_STATE_I
INSTEAD OF INSERT
ON VENDOR_IN_STATE_VIEW
REFERENCING NEW AS N
FOR EACH ROW MODE DB2SQL
BEGIN ATOMIC
DECLARE VENDOR_FOUND INT;
SET VENDOR_FOUND=(SELECT COUNT(*) FROM VENDOR WHERE NAME=N.NAME);
IF NOT(VENDOR_FOUND>0)
THEN
INSERT INTO VENDOR(VENDOR_ID, NAME) VALUES (N.VENDOR_ID, N.NAME);
END IF;
INSERT INTO VENDOR_IN_STATE(VENDOR_ID, STATE)
SELECT N.VENDOR_ID, PARSE_LIST.TOKEN FROM
TABLE(PARSE_LIST(N.STATE_LIST)) AS PARSE_LIST;
END @
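
The post does not show the definition of VENDOR_IN_STATE_VIEW, so the following sketch makes one up for illustration. It uses SQLite through Python: the view rebuilds the comma-separated list with group_concat, and the hypothetical function insert_vendor_with_states mirrors the three steps of the trigger body (look the vendor up by name, insert it if missing, insert one row per state). It is not a replacement for the DB2 trigger, only a way to follow its logic step by step.

```python
import sqlite3


def insert_vendor_with_states(conn, vendor_id, name, state_list):
    """Mirror the trigger body for a single inserted row."""
    # Step 1: does a vendor with this name already exist?
    found = conn.execute(
        "SELECT COUNT(*) FROM VENDOR WHERE NAME = ?", (name,)
    ).fetchone()[0]
    # Step 2: insert the vendor only if it was not found.
    if not found > 0:
        conn.execute(
            "INSERT INTO VENDOR(VENDOR_ID, NAME) VALUES (?, ?)",
            (vendor_id, name),
        )
    # Step 3: one VENDOR_IN_STATE row per token in the list.
    tokens = [t.strip() for t in state_list.split(",") if t.strip()]
    conn.executemany(
        "INSERT INTO VENDOR_IN_STATE(VENDOR_ID, STATE) VALUES (?, ?)",
        [(vendor_id, t) for t in tokens],
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50));
    CREATE TABLE VENDOR_IN_STATE(VENDOR_ID INT, STATE CHAR(2));
    -- Hypothetical read side: rebuild the comma-separated list per vendor.
    CREATE VIEW VENDOR_IN_STATE_VIEW AS
        SELECT V.VENDOR_ID, V.NAME, group_concat(S.STATE) AS STATE_LIST
        FROM VENDOR V
        JOIN VENDOR_IN_STATE S ON S.VENDOR_ID = V.VENDOR_ID
        GROUP BY V.VENDOR_ID, V.NAME;
""")

insert_vendor_with_states(conn, 1, "ABC INC.", "AK,AL,IL")
view_row = conn.execute("SELECT * FROM VENDOR_IN_STATE_VIEW").fetchone()
print(view_row)
```

Note that, like the trigger, this matches existing vendors by NAME rather than by VENDOR_ID, and that group_concat does not promise any particular order for the rebuilt list.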

I would really appreciate any feedback.
Are there any simpler approaches?

Solution

AK <ak************@yahoo.com> wrote:

Once upon a time there was a table:

CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50), STATE CHAR(2))@

After a while the developers realized that a vendor may be present in
several states, so the structure was changed to [...] I would really appreciate any feedback.
Are there any simpler approaches?



That looks pretty straightforward to me and is similar to what I once
wrote:
http://www-106.ibm.com/developerwork...03stolze1.html

The only alternative that might be a bit faster would be to use an external
table function, written in C or Java. That way, you could avoid the
recursion in SQL. I really don't know whether performance would improve,
however.
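
To make the non-recursive alternative concrete: an external table function would be written in C or Java on DB2, but the core of it is just a single left-to-right scan that hands back one token at a time. The Python generator below is only an illustration of that loop; the name iter_tokens and its parameters are hypothetical and it is not a DB2 external function.

```python
from typing import Iterator


def iter_tokens(c_list: str, sep: str = ",") -> Iterator[str]:
    """Yield tokens one at a time, scanning the string once from the left."""
    start = 0
    while True:
        pos = c_list.find(sep, start)
        if pos == -1:
            # No separator left: the rest of the string is the last token.
            yield c_list[start:]
            return
        yield c_list[start:pos]
        start = pos + 1


scanned = list(iter_tokens("AK,AR,IL,OH"))
print(scanned)
```

Each token is produced without building intermediate remainder strings, which is the part of the recursive SQL that an external function would avoid.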

--
Knut Stolze
Information Integration
IBM Germany / University of Jena


Why not keep the table as is (i.e. with comma-separated states) and
create a view that returns the states as separate
rows? That should be easier, as you won't need an INSTEAD OF trigger.

The view can be created like this:

select vendor_id,substr(state,locate(',',state,a * 3)+1,2) from
vendor,table(values(1),(2),(3),(4),(5)) as vendor_ids(a) where a <=
length(state)/3




subaga <su******@yahoo.com> wrote:

Why not keep the table as is (i.e. with comma-separated states) and
create a view that returns the states as separate
rows? That should be easier, as you won't need an INSTEAD OF trigger.

The view can be created like this:

select vendor_id,substr(state,locate(',',state,a * 3)+1,2) from
vendor,table(values(1),(2),(3),(4),(5)) as vendor_ids(a) where a <=
length(state)/3



That only works if:
(1) the states are always exactly 2 characters long. If they are not, then
your calculation doesn't work.
(2) you have at most 5 states. OK, you could open this up by adding more
rows to "vendor_ids"; depending on the actual data, you might need several
thousand rows (VARCHARs can be up to 32K long!), and then that might not be
simpler any more.
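
Both limitations are easy to reproduce outside the database. The sketch below is a simplified Python stand-in for a fixed-offset view, not a translation of the exact SQL quoted above: it slices a token every 3 characters and, like the 5-row "vendor_ids" table, never returns more than max_states tokens. The function name and parameters are hypothetical.

```python
def fixed_width_split(state_list, max_states=5, width=2):
    """Slice tokens at fixed offsets, the way a positional view would."""
    stride = width + 1                       # one token plus one comma
    count = (len(state_list) + 1) // stride  # tokens, if all have "width" chars
    tokens = []
    # range(1, max_states + 1) plays the role of the VALUES (1),...,(5) table.
    for a in range(1, max_states + 1):
        if a <= count:
            start = (a - 1) * stride
            tokens.append(state_list[start:start + width])
    return tokens


well_formed = fixed_width_split("CA,WA,OR")
too_many = fixed_width_split("AK,AL,IL,OH,NY,TX")
wrong_width = fixed_width_split("CA,ORE,WA")
print(well_formed, too_many, wrong_width)
```

The first call returns the three expected codes, the second silently drops the sixth state, and the third returns broken tokens as soon as one entry is not exactly 2 characters long.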

--
Knut Stolze
Information Integration
IBM Germany / University of Jena

