Pipe symbol | in AWK field delimiter

Question

I have a file foo that has the following data:

A<|>B<|>C<|>D
1<|>2<|>3<|>4

I want to access each column with awk, but it isn't interpreting the field separator correctly.

When I run:

head foo | \
  awk 'BEGIN {FS="<|>"} {out=""; for(i=1;i<=NF;i++){out=out" "$i}; print out}'

instead of printing

A B C D
1 2 3 4

it prints

A | B | C | D 
1 | 2 | 3 | 4

What's the reason behind this?

Recommended answer

The pipe is a special character in a regex: it means alternation. Unescaped, FS="<|>" therefore matches either < or >, so awk splits on each angle bracket and leaves every | behind as a field of its own. To match the pipe literally you need to escape it with a backslash. But the backslash is also a special character in a string literal, so it needs to be escaped again. You end up with the following:

awk -F '<\\|>' '{$1=$1}1'

awk 'BEGIN {FS="<\\|>"} {$1=$1}1' 
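To see the difference side by side, here is a minimal sketch that recreates the two sample lines from the question in a temporary file and runs both separators over it. The temporary file and the variable names `tmp`, `broken`, and `fixed` are only illustrative and not part of the original answer.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf 'A<|>B<|>C<|>D\n1<|>2<|>3<|>4\n' > "$tmp"

# Unescaped: the regex <|> means "< or >", so each pipe ends up as its own field.
broken=$(awk 'BEGIN {FS="<|>"} {$1=$1} 1' "$tmp")

# Escaped twice: the string "<\\|>" becomes the regex <\|>, which matches a literal <|>.
fixed=$(awk 'BEGIN {FS="<\\|>"} {$1=$1} 1' "$tmp")

# Print the unescaped result first, then the escaped one.
printf '%s\n' "$broken" "$fixed"
rm -f "$tmp"
```

The first two lines of output should still contain the pipes, while the last two should show only the four columns separated by spaces.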

The reason for this syntax is explained quite well in the gawk manual: http://www.gnu.org/software/gawk/manual/gawk.html#Computed-Regexps. In short, the expression is parsed twice: once as a string literal, and then again as a regex.
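If the double backslash feels error-prone, one alternative (not part of the original answer) is to put the pipe inside a bracket expression. Inside `[...]` the pipe has no special meaning, so no backslash is needed and the double parsing no longer matters. The sketch below assumes a single sample line in the question's format; the variable names `line`, `bracket`, and `count` are illustrative.

```shell
# Sample line in the same format as the question's data.
line='A<|>B<|>C<|>D'

# [|] is a bracket expression containing only the pipe, so no escaping is required.
bracket=$(printf '%s\n' "$line" | awk -F '<[|]>' '{$1=$1} 1')

# NF reports how many fields awk found on the line.
count=$(printf '%s\n' "$line" | awk -F '<[|]>' '{print NF}')

printf '%s (%s fields)\n' "$bracket" "$count"
```

Checking NF this way is a quick sanity test for any separator: if the count is higher than the number of columns you expect, the regex is probably splitting on something other than the full delimiter.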
