Parsing a comma-separated list
Once upon a time there was a table:
CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50), STATE CHAR(2))@
After a while the developers realized that a vendor may be present in
several states, so the structure was changed to
CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50), STATE_LIST VARCHAR(150))@
STATE_LIST could contain a list of comma-separated values like this:
CA,WA,OR
Not the best approach, but out here in the field things like this
happen all the time. Everybody was absolutely happy until somebody
required queries like these:
SELECT * FROM VENDOR WHERE STATE IN ('IL', 'WI', 'MI')
SELECT COUNT(*), STATE FROM VENDOR GROUP BY STATE
and so on.
Given a database structure developed by a skilled professional, that
would be a piece of cake, but not this time.
We moved to a better structure:
CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50))@
CREATE TABLE VENDOR_IN_STATE(VENDOR_ID INT, STATE CHAR(2))@
but nobody wanted to change the front-end application.
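To make the benefit of the two-table structure concrete, here is a minimal sketch that runs both kinds of query against it. It uses Python's sqlite3 module as a stand-in for DB2 (an assumption made only so the example is runnable), and the second vendor, 'XYZ LTD.', is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50));
    CREATE TABLE VENDOR_IN_STATE(VENDOR_ID INT, STATE CHAR(2));
    -- Hypothetical sample rows, not taken from the thread
    INSERT INTO VENDOR VALUES (1, 'ABC INC.'), (2, 'XYZ LTD.');
    INSERT INTO VENDOR_IN_STATE VALUES (1, 'AK'), (1, 'IL'), (2, 'IL'), (2, 'WI');
""")

# Vendors present in any of the requested states.
vendors_in_states = conn.execute("""
    SELECT DISTINCT V.VENDOR_ID, V.NAME
    FROM VENDOR V JOIN VENDOR_IN_STATE S ON S.VENDOR_ID = V.VENDOR_ID
    WHERE S.STATE IN ('IL', 'WI', 'MI')
    ORDER BY V.VENDOR_ID
""").fetchall()

# Number of vendors per state.
vendors_per_state = conn.execute("""
    SELECT STATE, COUNT(*) FROM VENDOR_IN_STATE GROUP BY STATE ORDER BY STATE
""").fetchall()

print(vendors_in_states)
print(vendors_per_state)
```

With one row per vendor and state, both queries are plain joins and aggregates; no string parsing is needed at query time.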
As a workaround, I created a view and INSTEAD OF triggers that worked
like this:
INSERT INTO VENDOR_IN_STATE_VIEW VALUES(1, 'ABC INC.', 'AK,AL,IL')@
SELECT * FROM VENDOR
VENDOR_ID NAME
----------- --------------------------------------------------
1 ABC INC.
1 record(s) selected.
SELECT * FROM VENDOR_IN_STATE
VENDOR_ID STATE
----------- -----
1 AK
1 AL
1 IL
3 record(s) selected.
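I used recursion to define both the view and the triggers. I wrapped the
recursive query in a UDF so that it could be reused:
CREATE FUNCTION PARSE_LIST(C_LIST VARCHAR(100))
RETURNS TABLE(TOKEN VARCHAR(100))
SPECIFIC PARSE_LIST
RETURN
WITH PARSED_LIST(STEP, TOKEN, REMAINDER, LEFTMOST_COMMA)
AS(
  -- Seed row: nothing parsed yet. Locate the first comma (0 if there is none)
  -- instead of hard-coding position 3, so a single-item list also works.
  VALUES(0, '', C_LIST, LOCATE(',', C_LIST))
  UNION ALL
  SELECT STEP + 1 AS STEP,
    -- Token: text before the leftmost comma (state codes are two characters),
    -- or the whole remainder when no comma is left
    CASE WHEN LEFTMOST_COMMA > 0 THEN
      CAST(LEFT(REMAINDER, LEFTMOST_COMMA - 1) AS CHAR(2))
    ELSE
      REMAINDER
    END AS TOKEN,
    -- Remainder: text after the leftmost comma; NULL ends the recursion
    CASE WHEN LEFTMOST_COMMA > 0 THEN
      CAST(SUBSTR(REMAINDER, LEFTMOST_COMMA + 1) AS VARCHAR(100))
    ELSE
      NULL
    END AS REMAINDER,
    -- Position of the leftmost comma within the new remainder
    LOCATE(',', SUBSTR(REMAINDER, LOCATE(',', REMAINDER) + 1)) AS LEFTMOST_COMMA
  FROM PARSED_LIST
  WHERE REMAINDER IS NOT NULL
)
SELECT TOKEN FROM PARSED_LIST WHERE STEP > 0
It is very easy to invoke this table UDF:
SELECT * FROM TABLE(PARSE_LIST('AK,AR,IL,OH')) AS PARSE_LIST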
TOKEN
------
AK
AR
IL
OH
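For readers who want to experiment with the recursive technique outside DB2, the following is a rough sketch of the same idea in SQLite, driven from Python's sqlite3 module. It is not the UDF above: it uses SQLite's INSTR and SUBSTR, appends a trailing comma instead of tracking a comma position, and the lowercase parse_list wrapper is a hypothetical helper named after the UDF.

```python
import sqlite3

SPLIT_SQL = """
WITH RECURSIVE PARSED_LIST(STEP, TOKEN, REMAINDER) AS (
    -- Seed: nothing parsed yet; a trailing comma terminates the last token.
    SELECT 0, NULL, :c_list || ','
    UNION ALL
    -- Each step peels off the text before the first comma.
    SELECT STEP + 1,
           SUBSTR(REMAINDER, 1, INSTR(REMAINDER, ',') - 1),
           SUBSTR(REMAINDER, INSTR(REMAINDER, ',') + 1)
    FROM PARSED_LIST
    WHERE REMAINDER <> ''
)
SELECT TOKEN FROM PARSED_LIST WHERE STEP > 0 ORDER BY STEP
"""


def parse_list(c_list):
    """Return the comma-separated tokens of c_list, in order."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(SPLIT_SQL, {"c_list": c_list})]
    finally:
        conn.close()


print(parse_list("AK,AR,IL,OH"))
```

Because each step cuts at the first comma it finds, this variant does not depend on tokens being two characters long.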
I use the UDF in an INSTEAD OF trigger:
CREATE TRIGGER VENDOR_IN_STATE_I
INSTEAD OF INSERT
ON VENDOR_IN_STATE_VIEW
REFERENCING NEW AS N
FOR EACH ROW MODE DB2SQL
BEGIN ATOMIC
  DECLARE VENDOR_FOUND INT;
  -- Insert the vendor only if no vendor with this name exists yet
  SET VENDOR_FOUND = (SELECT COUNT(*) FROM VENDOR WHERE NAME = N.NAME);
  IF NOT (VENDOR_FOUND > 0) THEN
    INSERT INTO VENDOR(VENDOR_ID, NAME) VALUES (N.VENDOR_ID, N.NAME);
  END IF;
  -- One VENDOR_IN_STATE row per token of the comma-separated list
  INSERT INTO VENDOR_IN_STATE(VENDOR_ID, STATE)
    SELECT N.VENDOR_ID, PARSE_LIST.TOKEN
    FROM TABLE(PARSE_LIST(N.STATE_LIST)) AS PARSE_LIST;
END @
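The trigger's logic can be traced step by step outside the database. The sketch below mirrors it in Python on an in-memory SQLite database; insert_vendor_with_states is a hypothetical helper, str.split stands in for PARSE_LIST, and the transaction block is only loosely analogous to BEGIN ATOMIC.

```python
import sqlite3


def insert_vendor_with_states(conn, vendor_id, name, state_list):
    """Mirror the trigger body: add the vendor if its name is new, then one row per state."""
    with conn:  # commit on success, roll back on error
        (vendor_found,) = conn.execute(
            "SELECT COUNT(*) FROM VENDOR WHERE NAME = ?", (name,)
        ).fetchone()
        if not vendor_found > 0:
            conn.execute(
                "INSERT INTO VENDOR(VENDOR_ID, NAME) VALUES (?, ?)", (vendor_id, name)
            )
        # str.split plays the role of the PARSE_LIST table function here.
        conn.executemany(
            "INSERT INTO VENDOR_IN_STATE(VENDOR_ID, STATE) VALUES (?, ?)",
            [(vendor_id, token) for token in state_list.split(",")],
        )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50));
    CREATE TABLE VENDOR_IN_STATE(VENDOR_ID INT, STATE CHAR(2));
""")
insert_vendor_with_states(conn, 1, "ABC INC.", "AK,AL,IL")
vendors = conn.execute("SELECT * FROM VENDOR").fetchall()
states = conn.execute("SELECT * FROM VENDOR_IN_STATE ORDER BY STATE").fetchall()
print(vendors)
print(states)
```

Note that, like the trigger, this checks for an existing vendor by NAME rather than by VENDOR_ID.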
I would really appreciate any feedback.
Are there any simpler approaches?
AK <ak************@yahoo.com> wrote:
Once upon a time there was a table:
CREATE TABLE VENDOR(VENDOR_ID INT, NAME VARCHAR(50), STATE CHAR(2))@
After a while the developers realized that a vendor may be present in
several states, so the structure was changed to
[...]
I would really appreciate any feedback.
Are there any simpler approaches?
That looks pretty straightforward to me and similar to what I wrote
once:
http://www-106.ibm.com/developerwork...03stolze1.html
The only alternative that might be a bit faster would be to use an external
table function, written in C or Java. That way, you could avoid the
recursion in SQL. I really don't know if performance would improve or not,
however.
--
Knut Stolze
Information Integration
IBM Germany / University of Jena
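As a rough illustration of what the body of such a non-recursive function would do, here is a sketch of an iterative scanner. It is written in Python only for brevity and is not DB2's external table function interface; scan_tokens is a hypothetical name.

```python
from typing import Iterator


def scan_tokens(c_list: str, delimiter: str = ",") -> Iterator[str]:
    """Yield the tokens of c_list one at a time using a simple loop, no recursion."""
    start = 0
    while True:
        end = c_list.find(delimiter, start)
        if end == -1:
            # No delimiter left: the rest of the string is the last token.
            yield c_list[start:]
            return
        yield c_list[start:end]
        start = end + 1


print(list(scan_tokens("AK,AR,IL,OH")))
```

The generator hands back one token per step, which is the same row-at-a-time shape a table function presents to the query.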
Why not keep the table as is (i.e. with comma-separated states) and
create a view that has the names of the states in different
rows? That should be easier, as you won't need an INSTEAD OF trigger.
The view can be created like this:
select vendor_id, substr(state, locate(',', state, a * 3) + 1, 2) from
vendor, table(values (1),(2),(3),(4),(5)) as vendor_ids(a) where a <=
length(state)/3
subaga <su******@yahoo.com> wrote:
Why not keep the table as is (i.e. with comma-separated states) and
create a view that has the names of the states in different
rows? That should be easier, as you won't need an INSTEAD OF trigger.
The view can be created like this:
select vendor_id, substr(state, locate(',', state, a * 3) + 1, 2) from
vendor, table(values (1),(2),(3),(4),(5)) as vendor_ids(a) where a <=
length(state)/3
That only works if:
(1) the states are always exactly 2 characters long - if they are not, then
your calculation doesn't work.
(2) you have at most 5 states - OK, you could open this up by adding more
rows to "vendor_ids"; depending on the actual data, you might need several
thousand rows (VARCHARs can be up to 32K long!), and that might not be
simpler then.
--
Knut Stolze
Information Integration
IBM Germany / University of Jena
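Both limitations are easy to see in a small sketch. The function below is not the view from the post: it is a simplified Python emulation of fixed-offset slicing (offsets 0, 3, 6, ...) with a fixed number of helper rows, and positional_tokens, max_rows and width are hypothetical names.

```python
def positional_tokens(state_list, max_rows=5, width=2):
    """Slice fixed-width tokens at offsets 0, 3, 6, ... for at most max_rows rows."""
    stride = width + 1  # one token plus its comma
    count = min(max_rows, (len(state_list) + 1) // stride)
    return [state_list[a * stride : a * stride + width] for a in range(count)]


# Works when every token is exactly two characters and there are at most five.
print(positional_tokens("CA,WA,OR"))
# A sixth state is silently dropped because there are only five helper rows.
print(positional_tokens("AK,AL,AR,AZ,CA,CO"))
# A three-character token shifts every later offset and yields garbage.
print(positional_tokens("CAL,WA"))
```

Raising max_rows addresses the second limitation only by making the helper table as long as the longest possible list.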