Split string (e.g. with bash) but skip part of it

Problem description

How can I split the following string with bash (awk, sed, whatever)?

in:

a,b,[c, d],e

output:

a
b
[c, d]
e

try 1)

$ IFS=',' read -a tokens <<< "a,b,[c, d], e"; echo ${tokens[@]}
a b [c d] e

try 2)

$ IFS=',' 
$ line="a,b,[c, d], e"
$ eval x=($line)
$ echo ${x[1]}
b
$ echo ${x[0]}
a
$ echo ${x[2]}
[c  d]

But the ',' is missing!

Solution

This is just a specific instance of the general CSV problem: telling commas inside quotes apart from those outside of quotes so that you can replace either kind with some other character (e.g. ;). The idiomatic awk solution to that (besides using FPAT in GNU awk) is:

Replace inside the quotes:

$ echo 'a,b,"c, d",e' | awk 'BEGIN{FS=OFS="\""} {for (i=2;i<=NF;i+=2) gsub(/,/,";",$i)}1'
a,b,"c; d",e

Replace outside the quotes:

$ echo 'a,b,"c, d",e' | awk 'BEGIN{FS=OFS="\""} {for (i=1;i<=NF;i+=2) gsub(/,/,";",$i)}1'
a;b;"c, d";e
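The same one-liner keeps working when a line has more than one quoted section, because splitting on the quote character always leaves the text outside the quotes in the odd-numbered fields. A minimal sketch, assuming the quotes are balanced; the function name semicolons_outside_quotes is made up for this illustration:

```shell
# Hypothetical wrapper around the "replace outside the quotes" one-liner above.
semicolons_outside_quotes() {
  # Split on double quotes; odd-numbered fields are the text outside the quotes.
  awk 'BEGIN{FS=OFS="\""} {for (i=1;i<=NF;i+=2) gsub(/,/,";",$i)}1'
}

# Two quoted sections on one line.
echo 'a,"b, c",d,"e, f"' | semicolons_outside_quotes
```

This should print a;"b, c";d;"e, f", with the commas inside both quoted sections left alone.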

In your case the delimiters are [...] instead of "..." and the replacement character is a newline instead of a semi-colon but it's essentially the same problem:

Replace outside the "quotes" (square brackets):

$ echo 'a,b,[c, d],e' | awk 'BEGIN{FS="[][]"; OFS=""} {for (i=1;i<=NF;i+=2) gsub(/,/,"\n",$i)}1'
a
b
c, d
e
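To see why stepping through the odd-numbered fields is enough, it can help to print how awk splits the line when FS is the bracket expression [][] (either a closing or an opening square bracket). This is only an inspection sketch; the function name show_fields is an assumption of this example:

```shell
# Hypothetical helper: list each field awk sees and whether it was inside [...].
show_fields() {
  awk 'BEGIN{FS="[][]"}
       {for (i=1;i<=NF;i++) printf "%d: <%s> (%s)\n", i, $i, (i%2 ? "outside" : "inside")}'
}

echo 'a,b,[c, d],e' | show_fields
```

For this input there are three fields: <a,b,> and <,e> fall outside the brackets (fields 1 and 3), and <c, d> falls inside (field 2), so only fields 1 and 3 get their commas replaced.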

Note that the square brackets are gone because I set OFS to the empty string, since there is no single FS character to use. If you really do need them, you can get them back like this:

$ echo 'a,b,[c, d],e' | awk 'BEGIN{FS="[][]"; OFS=""} {for (i=1;i<=NF;i++) if (i%2) gsub(/,/,"\n",$i); else $i="["$i"]"}1'
a
b
[c, d]
e

But chances are you don't, since their purpose was to group text that contained commas, and that's now handled by newlines being the field separators instead of commas.
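Once the tokens are newline-separated, the shell can consume them one line at a time. A rough sketch, assuming brackets are balanced and not nested; split_outside_brackets is a name invented here, and the awk program is the bracket-preserving one-liner from above:

```shell
# Hypothetical helper: print one token per line, splitting only on commas outside [...].
split_outside_brackets() {
  awk 'BEGIN{FS="[][]"; OFS=""}
       {for (i=1;i<=NF;i++) if (i%2) gsub(/,/,"\n",$i); else $i="["$i"]"} 1'
}

# Read the tokens line by line; IFS= and -r keep spaces and backslashes intact.
echo 'a,b,[c, d],e' | split_outside_brackets | while IFS= read -r token; do
  printf 'token: %s\n' "$token"
done
```

In bash 4 or later, mapfile -t tokens < <(echo 'a,b,[c, d],e' | split_outside_brackets) would collect the same lines into an array instead. Spaces after a comma outside the brackets (as in "[c, d], e") stay attached to the following token, so trim them separately if that matters.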
