拆分多行CLOB列-Oracle PL/SQL [英] Split Multiline CLOB Column - Oracle PL/SQL

查看:135
本文介绍了拆分多行CLOB列-Oracle PL/SQL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个表,其中包含一个CLOB字段,其值由逗号分隔的值行组成.在输出中,我希望这些行中的每行都以特定值开头.我还想高效地提取一些逗号分隔的值.

I have a table which includes a CLOB field with values consisting of lines of comma-separated values. In the output I want a row for each these lines which start with a particular value. I also want to extract some of the comma-separated values performantly.

输入表(3行):

id         my_clob
001        500,aaa,bbb
           500,ccc,ddd
           480,1,2,bad
           500,eee,fff
002        777,0,0,bad
003        500,yyy,zzz

目标输出(4行):

id         my_clob          line_num    line           second_val
001        500,aaa,bbb      1           500,aaa,bbb    aaa
           500,ccc,ddd
           480,1,2,bad
           500,eee,fff
001        500,aaa,bbb      2           500,ccc,ddd    ccc
           500,ccc,ddd
           480,1,2,bad
           500,eee,fff
001        500,aaa,bbb      3           500,eee,fff    eee
           500,ccc,ddd
           480,1,2,bad
           500,eee,fff
003        500,yyy,zzz      1           500,yyy,zzz    yyy

这与此问题非常相似,但在这种情况下,会有一个非隐藏字符可以分割.一个相关问题会删除换行符.我想对分隔这些行的任何字符进行拆分.我尝试了 chr(10)||的变体chr(13) [:space:] + 没有成功

This is very similar to this question, but in that case, there was a non-hidden character to split on. A related question removes newline characters. I'd like to split on whatever character(s) are separating these lines. I've tried variants of chr(10) || chr(13) and [:space:]+ without success

我的尝试:

SELECT 
   id
   ,my_clob
   ,level as line_num
   ,regexp_substr(my_clob,'^500,\S+', 1, level, 'm') as line 
   ,regexp_substr(
        regexp_substr(
            my_clob,'^500,\S+', 1, level, 'm'
        )
        ,'[^,]+', 1, 2
   ) as second_val
FROM tbl
CONNECT BY level <= regexp_count(my_clob, '^500,\S+', 1, 'm')
  and prior id = id
  and prior sys_guid() is not null

取决于我如何调整匹配模式,结果通常仅从my_clob的第一行得出.

The result is generally only derived from the first line in my_clob, depending on how I adjust the match pattern.

根据

'm'将源字符串视为多行.Oracle将插入符号(^)和美元符号($)分别解释为源字符串中任意行的任何行的开始和结束

'm' treats the source string as multiple lines. Oracle interprets the caret (^) and dollar sign ($) as the start and end, respectively, of any line anywhere in the source string

这些 regexp _ *(my_clob,'^ 500,\ S +',1,'m')有什么问题吗?还是更好,没有正则表达式,还有没有更高性能的方法?

Anything wrong with these regexp_*(my_clob, '^500,\S+', 1, 'm')? Or better yet, is there a more performant way without regex?

推荐答案

您可以按以下方式使用 REGEXP :

You can use the REGEXP as follows:

SQL> -- sample data
SQL> with your_data(id,myclob) as
  2  (select 1, '500,aaa,bbb
  3             500,ccc,ddd
  4             480,1,2,bad
  5             500,eee,fff' from dual)
  6  -- Your query starts from here
  7  select id, myclob, line_num, lines as line,
  8  regexp_substr(lines,'[^,]+',1,2) as second_val
  9  from
 10    (select id, myclob, column_value as line_num,
 11  trim(regexp_substr(d.myclob,'.+',1,column_value,'m')) as lines
 12     from your_data d
 13     cross join table(cast(multiset(select level from dual
 14                                 connect by  level <= regexp_count(d.myclob,'$',1,'m'))
 15     as sys.OdciNumberList)) levels)
 16  where regexp_like(lines,'^[500]');

 ID MYCLOB                    LINE_NUM LINE            SECOND_VAL
--- ------------------------- -------- --------------- ----------
  1 500,aaa,bbb                      1 500,aaa,bbb     aaa
    500,ccc,ddd
    480,1,2,bad
    500,eee,fff

  1 500,aaa,bbb                      2 500,ccc,ddd     ccc
    500,ccc,ddd
    480,1,2,bad
    500,eee,fff

  1 500,aaa,bbb                      4 500,eee,fff     eee
    500,ccc,ddd
    480,1,2,bad
    500,eee,fff


SQL>

这篇关于拆分多行CLOB列-Oracle PL/SQL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆