Split string into several rows
Question
I have a table with a string column that contains several delimited values, e.g. a;b;c.
I need to split this string and use its values in a query. For example, I have the following table:
str
a;b;c
b;c;d
a;c;d
I need to group by the individual values in the str column to get the following result:
str count(*)
a 2
b 2
c 3
d 2
Is it possible to do this with a single select query? I cannot create temporary tables to extract the values into and then query against.
Recommended answer
From your comment on @PrzemyslawKruglej's answer:
"Main problem is with internal query with connect by, it generates astonishing amount of rows"
The number of rows generated can be reduced with the following approach:
/* test table populated with sample data from your question */
create table t1(str) as (
  select 'a;b;c' from dual union all
  select 'b;c;d' from dual union all
  select 'a;c;d' from dual
);
-- Table created.
-- The number of rows generated depends solely on the longest string.
-- If (say) the longest string contains 3 words (not counting the separator `;`)
-- and we have 100 rows in our table, then we end up with 300 rows
-- for further processing, no more.
with occurrence(ocr) as (
  select level
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
  connect by level <= mx_t
)
select count(regexp_substr(t1.str, '[^;]+', 1, o.ocr)) as generated_for_3_rows
  from t1
 cross join occurrence o;
Result: for three rows, the longest of which consists of three words, we generate 9 rows:
GENERATED_FOR_3_ROWS
--------------------
9
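To see where those 9 rows come from, it can help to look at the intermediate rows before they are counted. The following is a small sketch that reuses the t1 table and the occurrence subquery from above; only the select list and the order by are changed for illustration.

```sql
-- Sketch: list every (string, position) pair that the cross join produces,
-- together with the word extracted at that position.
with occurrence(ocr) as (
  select level
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
  connect by level <= mx_t      -- yields 1, 2, 3 for the sample data
)
select t1.str
     , o.ocr
     , regexp_substr(t1.str, '[^;]+', 1, o.ocr) as res
  from t1
 cross join occurrence o
 order by t1.str, o.ocr;

-- With the three sample rows this returns 3 x 3 = 9 rows:
--   a;b;c  1  a
--   a;b;c  2  b
--   a;b;c  3  c
--   a;c;d  1  a
--   a;c;d  2  c
--   a;c;d  3  d
--   b;c;d  1  b
--   b;c;d  2  c
--   b;c;d  3  d
```

If some strings held fewer words than the longest one, res would be null for the missing positions, which is why the final query filters on res is not null.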
Final query:
with occurrence(ocr) as (
  select level
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
  connect by level <= mx_t
)
select res
     , count(res) as cnt
  from ( select regexp_substr(t1.str, '[^;]+', 1, o.ocr) as res
           from t1
          cross join occurrence o )
 where res is not null
 group by res
 order by res;
Result:
RES CNT
----- ----------
a 2
b 2
c 3
d 2
Find out more about the regexp_count() (11g and up) and regexp_substr() regular expression functions.
Note: Regular expression functions are relatively expensive to compute, and when processing a very large amount of data it might be worth considering switching to plain PL/SQL. Here is an example.
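The example linked from the note above is not reproduced here. As a separate, hedged sketch (not the answer author's code, and plain SQL rather than PL/SQL), the same grouping can also be written without regular expression functions by using instr() and substr(). It assumes the same t1 table and the `;` separator; the names padded and p are made up for illustration. Whether it is actually faster on a given data set would need to be measured.

```sql
-- Sketch: split on ';' with instr/substr instead of regexp_* functions.
with occurrence(ocr) as (
  select level
    from ( -- number of words = number of separators + 1
           select max(length(str) - nvl(length(replace(str, ';')), 0) + 1) as mx_t
             from t1 ) t
  connect by level <= mx_t
)
select res
     , count(res) as cnt
  from ( -- each string is wrapped in separators, so the n-th word
         -- always sits between the n-th and (n+1)-th ';'
         select substr( p.padded
                      , instr(p.padded, ';', 1, o.ocr) + 1
                      , instr(p.padded, ';', 1, o.ocr + 1)
                        - instr(p.padded, ';', 1, o.ocr) - 1 ) as res
           from ( select ';' || str || ';' as padded from t1 ) p
          cross join occurrence o )
 where res is not null   -- positions past the last word yield null
 group by res
 order by res;
```

For the sample data this should give the same four rows as the final query above (a 2, b 2, c 3, d 2). Empty elements such as `a;;b` come out as null and are filtered away, which matches what the `[^;]+` pattern does.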