split string (e.g. with bash) but skip part of it
How can I split with bash (awk, sed, whatever) the following string:
in:
a,b,[c, d],e
output:
a
b
[c, d]
e
try 1)
$ IFS=',' read -a tokens <<< "a,b,[c, d], e"; echo ${tokens[@]}
a b [c d] e
try 2)
$ IFS=','
$ line="a,b,[c, d], e"
$ eval x=($line)
$ echo ${x[1]}
b
$ echo ${x[0]}
a
$ echo ${x[2]}
[c d]
But the ',' inside the brackets should not be treated as a separator!
This is just a specific instance of the general CSV problem of identifying commas inside quotes differently from those outside of quotes in order to replace either one with some other character (e.g. ";"). The idiomatic awk solution to that (besides using FPAT in GNU awk) is:
Replace inside the quotes:
$ echo 'a,b,"c, d",e' | awk 'BEGIN{FS=OFS="\""} {for (i=2;i<=NF;i+=2) gsub(/,/,";",$i)}1'
a,b,"c; d",e
Replace outside the quotes:
$ echo 'a,b,"c, d",e' | awk 'BEGIN{FS=OFS="\""} {for (i=1;i<=NF;i+=2) gsub(/,/,";",$i)}1'
a;b;"c, d";e
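The two commands above rely on how awk numbers the fields once the double quote is the field separator: odd-numbered fields are the text outside the quotes and even-numbered fields are the text inside them. A minimal sketch to make that visible (the helper name `show_fields` is made up for illustration, it is not part of the original commands):

```shell
# show_fields is a hypothetical helper: it prints each field awk sees,
# with its index, when the double quote is the field separator.
show_fields() {
    printf '%s\n' "$1" |
        awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d: <%s>\n", i, $i }'
}

show_fields 'a,b,"c, d",e'
# 1: <a,b,>
# 2: <c, d>
# 3: <,e>
```

Field 2 is the quoted text, which is why the "inside" loop starts at i=2 and the "outside" loop starts at i=1, both stepping by 2.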
In your case the delimiters are [...] instead of "..." and the replacement character is a newline instead of a semi-colon, but it's essentially the same problem:
Replace outside the "quotes" (square brackets):
$ echo 'a,b,[c, d],e' | awk 'BEGIN{FS="[][]"; OFS=""} {for (i=1;i<=NF;i+=2) gsub(/,/,"\n",$i)}1'
a
b
c, d
e
Note that the square brackets are gone: FS is a bracket expression matching either "[" or "]", so there is no single FS character to reuse as OFS, and I set OFS to the empty string. You can get them back with this if you actually do need them:
$ echo 'a,b,[c, d],e' | awk 'BEGIN{FS="[][]"; OFS=""} {for (i=1;i<=NF;i++) if (i%2) gsub(/,/,"\n",$i); else $i="["$i"]"}1'
a
b
[c, d]
e
but chances are you don't, since their purpose was to group text that contained commas, and that is now handled by newlines, rather than commas, being the field separators.
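If you would rather stay in bash without awk, the same idea can be sketched as a character-by-character scan that only splits on a comma when it is not inside brackets. This is a different technique from the awk answer above, not a rewrite of it; the function name `split_outside_brackets` is made up for illustration, and the sketch assumes the brackets are balanced:

```shell
#!/usr/bin/env bash
# split_outside_brackets is a hypothetical helper name.
# Prints each comma-separated field of $1 on its own line, treating
# commas inside [...] as ordinary characters. Assumes balanced brackets.
split_outside_brackets() {
    local input=$1 field='' char
    local depth=0 i
    for (( i = 0; i < ${#input}; i++ )); do
        char=${input:i:1}
        case $char in
            '[')
                depth=$(( depth + 1 ))
                field+=$char
                ;;
            ']')
                if (( depth > 0 )); then depth=$(( depth - 1 )); fi
                field+=$char
                ;;
            ',')
                if (( depth == 0 )); then
                    # Comma outside brackets: the current field is complete.
                    printf '%s\n' "$field"
                    field=''
                else
                    # Comma inside brackets: keep it as part of the field.
                    field+=$char
                fi
                ;;
            *)
                field+=$char
                ;;
        esac
    done
    # Emit the last field, which has no trailing comma.
    printf '%s\n' "$field"
}

split_outside_brackets 'a,b,[c, d],e'
# a
# b
# [c, d]
# e
```

Since the output is one field per line, in bash 4 or later it can be read into an array with `mapfile -t tokens < <(split_outside_brackets 'a,b,[c, d],e')`, after which `"${tokens[2]}"` holds `[c, d]`. For long inputs the awk version will likely be faster, because this loop handles one character at a time in the shell.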