awk: ignore delimiter inside single quotes within parentheses
Problem description
I have a set of data in a CSV file, as below:
Given Data:
(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye),
(13,'hello','this fruit,is super tasty (sweet actually)',goodbye)
I want to print the given data as 2 rows, each running from ( to ), while ignoring the delimiter , and any ( ) that appear inside the ' ' fields.
How can I do this using awk or sed on Linux?
Expected result:
row 1 = 12,'hello','this girl,is lovely(adorable actually)',goodbye
row 2 = 13,'hello','this fruit,is super tasty (sweet actually)',goodbye
UPDATE: I just noticed that there is a comma between the 2 rows. So how can I separate the data into 2 rows using the , that comes after ) and before (?
Recommended answer
You can use the following awk command to achieve your goal:
awk -i inplace -v INPLACE_SUFFIX=.bak '{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}' file.in
This was tested on your input.
Explanation:
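- -i inplace -v INPLACE_SUFFIX=.bak (GNU awk 4.1 or later) edits file.in in place and keeps a backup copy with the .bak suffix. The shorthand -i.bak does not do this in gawk, which reads -i as the name of a source file to include.
- {str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;} first removes the first and last character of the line (the enclosing parentheses), then deletes every literal \r and \n together with an optional following space, and finally prints the result in the format you want.
- If your file has a header line, add the condition NR>1 before the {...} block to skip it:
'NR>1{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}'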
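To check the result before touching the file, the same program can be run without the in-place option, so that it only prints to standard output. The sketch below is a minimal illustration under stated assumptions: the temporary file, its header line, and the use of NR-1 to restart the numbering at 1 are made up for the example and are not part of the original answer. It also assumes each data line holds exactly one parenthesised record with no trailing comma.

```shell
# Hypothetical sample: one header line, then two records without a trailing comma.
sample=$(mktemp)
cat > "$sample" <<'EOF'
id,greeting,text,farewell
(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye)
(13,'hello','this fruit,is super tasty (sweet actually)',goodbye)
EOF

# NR>1 skips the header line; (NR-1) makes the row numbers start at 1 again.
# Nothing is written back to the file: the output is only captured and printed.
preview=$(awk 'NR>1{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row " (NR-1) " = " str}' "$sample")
printf '%s\n' "$preview"

rm -f "$sample"
```

With plain NR instead of (NR-1), the first data row would be labelled row 2, because the header still counts as record 1.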
Following the change in your requirements, I have adapted the awk command so that the , at the end of a line is treated as part of the record separator (row separator):
awk -i inplace -v INPLACE_SUFFIX=.bak 'BEGIN{RS=",\n|\n"}{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}' file.in
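where BEGIN{RS=",\n|\n"} defines your row separator constraint: a record ends at a comma followed by a newline, or at a newline alone. POSIX leaves a multi-character RS unspecified; GNU awk and mawk treat it as a regular expression.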
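The updated command can be tried on the data from the question without modifying any file. The sketch below writes that data to a temporary file (a hypothetical stand-in for file.in) and prints to standard output. It also shows an alternative that is not part of the answer: for an awk that does not accept a regular expression as RS, the default record separator can be kept and the trailing ) or ), stripped with sub() instead.

```shell
# Temporary copy of the question's data, including the comma after the first ")".
data=$(mktemp)
cat > "$data" <<'EOF'
(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye),
(13,'hello','this fruit,is super tasty (sweet actually)',goodbye)
EOF

# 1) The answer's program, printing to stdout instead of editing in place.
#    Relies on RS being treated as a regular expression (gawk, mawk).
with_rs=$(awk 'BEGIN{RS=",\n|\n"}{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str}' "$data")

# 2) Alternative (an assumption, not from the answer): keep the default RS and
#    remove the leading "(" and the trailing ")" or ")," with sub().
with_sub=$(awk '{sub(/^\(/,""); sub(/\),?$/,""); gsub(/\\[rn] ?/,""); print "row "NR" = "$0}' "$data")

printf '%s\n' "$with_rs"
rm -f "$data"
```

Both variants should give the same two rows on this input. The second one anchors its patterns at the start and end of the line, so parentheses inside the quoted field are left alone.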