How to skip comma from csv using double quotes
Question
After reading data from a CSV file, I get a string in the format below:
v_lastline = '29218368,8062115,Benedict Canyon Equities, Inc,CLS,,FAX';
I want to convert it into an array that contains 6 values; the comma before Inc needs to be escaped.
Can anyone suggest the best way to do this in PL/SQL?
Recommended answer
This is similar to a related question, but you have empty elements in your list, and a simple translation of one of the patterns I tried there skips those:
var v_lastline varchar2(200);
exec :v_lastline := '29218368,8062115," Benedict Canyon Equities, Inc",CLS,,FAX';
select level as lvl,
  regexp_substr(:v_lastline, '("[^"]*"|[^,]+)', 1, level) as element
from dual
connect by level <= regexp_count(:v_lastline, '("[^"]*"|[^,]+)');
LVL ELEMENT
---------- ----------------------------------------
1 29218368
2 8062115
3 " Benedict Canyon Equities, Inc"
4 CLS
5 FAX
If you can identify a special character that will never appear in the data, you can work around that: change every comma to comma+character, so that the empty elements are no longer empty, and then remove the character again after the split:
select level as lvl,
replace(regexp_substr(replace(:v_lastline, ',', ',§'),
'(§"[^"]*"|[^,]+)', 1, level), '§', null) as element
from dual
connect by regexp_substr(replace(:v_lastline, ',', ',§'),
'(§"[^"]*"|[^,]+)', 1, level) is not null;
LVL ELEMENT
---------- ----------------------------------------
1 29218368
2 8062115
3 " Benedict Canyon Equities, Inc"
4 CLS
5
6 FAX
It's an extension of a common method to split delimited strings, which works as follows:
- replace(:v_lastline, ',', ',§') changes ...,CLS,,FAX to ...,§CLS,§,§FAX, where § is a character you'll never see in the data.
- regexp_substr(..., '(§"[^"]*"|[^,]+)', 1, level) tokenises the updated value with a regex that looks for either a double-quote-enclosed value (now preceded by the special character too) or a run of non-comma characters; the order of evaluation means commas inside the quoted part are ignored.
- level is part of the hierarchical query syntax, where connect by regexp_substr(<same value and pattern>) is not null just works out how many tokens there are.
- Finally, replace(regexp_substr(...), '§', null) removes the special character used in the first step.
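To see what each step contributes, the intermediate values can be put side by side. This is only a sketch: the sample string is inlined through a src subquery (an assumption made here so the query runs without the :v_lastline bind variable), and the column names raw_token and element are illustrative.

```sql
-- Sample line inlined so the query runs standalone (no bind variable needed).
with src as (
  select '29218368,8062115," Benedict Canyon Equities, Inc",CLS,,FAX' as str
  from dual
), marked as (
  -- Step 1: tag every comma with the marker character.
  select replace(str, ',', ',§') as str from src
)
select level as lvl,
       -- Step 2: the token as matched, marker still attached.
       regexp_substr(str, '(§"[^"]*"|[^,]+)', 1, level) as raw_token,
       -- Step 4: the same token with the marker stripped.
       replace(regexp_substr(str, '(§"[^"]*"|[^,]+)', 1, level), '§', null) as element
from marked
-- Step 3: keep generating rows while there is another token.
connect by regexp_substr(str, '(§"[^"]*"|[^,]+)', 1, level) is not null;
```

For the empty fifth field, raw_token is just the marker, which is why that row survives the split and ends up as a null element.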
You can then remove the double-quotes too with a further level of replace(), and trim whitespace, if you want or need to.
You haven't said quite what you mean by an array, but you can run that query in PL/SQL and bulk-collect into a collection if that's what you intend to work with. For example, using the built-in ODCIVARCHAR2LIST collection type:
set serveroutput on
declare
  v_lastline varchar2(200);
  v_array sys.odcivarchar2list;
begin
  v_lastline := '29218368,8062115," Benedict Canyon Equities, Inc",CLS,,FAX';
  -- strip the marker and the double-quotes from each token, then trim whitespace
  select trim(replace(replace(
           regexp_substr(replace(v_lastline, ',', ',§'),
             '(§"[^"]*"|[^,]+)', 1, level), '§', null), '"', null))
  bulk collect into v_array
  from dual
  connect by regexp_substr(replace(v_lastline, ',', ',§'),
    '(§"[^"]*"|[^,]+)', 1, level) is not null;
  dbms_output.put_line('Number of elements: ' || v_array.count);
  for i in 1..v_array.count loop
    dbms_output.put_line('Index ' || i || ' has: ' || v_array(i));
  end loop;
end;
/
Number of elements: 6
Index 1 has: 29218368
Index 2 has: 8062115
Index 3 has: Benedict Canyon Equities, Inc
Index 4 has: CLS
Index 5 has:
Index 6 has: FAX
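If the split is needed in more than one place, the same query could be wrapped in a function. The following is a minimal sketch, not part of the original answer: the name split_csv_line and its parameter p_line are hypothetical, and it still assumes that § never occurs in the data.

```sql
-- Hypothetical helper: returns the fields of one CSV line as a collection.
create or replace function split_csv_line (
  p_line in varchar2
) return sys.odcivarchar2list
is
  -- Marker assumed never to appear in the data.
  c_marker  constant varchar2(1 char) := '§';
  c_pattern constant varchar2(30)     := '(§"[^"]*"|[^,]+)';
  -- Tag every comma once, up front, so empty fields are not skipped.
  v_marked  varchar2(4000) := replace(p_line, ',', ',' || c_marker);
  v_result  sys.odcivarchar2list;
begin
  -- Tokenise, then strip the marker and the double-quotes and trim whitespace.
  select trim(replace(replace(
           regexp_substr(v_marked, c_pattern, 1, level), c_marker, null), '"', null))
  bulk collect into v_result
  from dual
  connect by regexp_substr(v_marked, c_pattern, 1, level) is not null;
  return v_result;
end split_csv_line;
/
```

Because it returns a SQL collection type, the result can also be queried directly, for example with select column_value from table(split_csv_line(:v_lastline)).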
This also (now) works with multiple consecutive empty elements:
exec :v_lastline := '29218368,8062115," Benedict Canyon Equities, Inc",,,,,,,CLS,,,,,FAX,,,,,,,,,,,,,,,,,,INVOICE';
select level as lvl,
replace(regexp_substr(replace(:v_lastline, ',', ',§'),
'(§"[^"]*"|[^,]+)', 1, level), '§', null) as element
from dual
connect by regexp_substr(replace(:v_lastline, ',', ',§'),
'(§"[^"]*"|[^,]+)', 1, level) is not null;
LVL ELEMENT
---------- ----------------------------------------
1 29218368
2 8062115
3 " Benedict Canyon Equities, Inc"
4
...
9
10 CLS
11
...
14
15 FAX
16
...
32
33 INVOICE