awk: Split on "\n"
Question
I'm trying to process a log file in which entries are compressed into one line, with each newline encoded as the two characters "\n". I want to keep everything up to the first "\n" and discard the rest.
awk -F"\n" '{print $1}' file
doesn't work, and neither does
awk -F"\\n" '{print $1}' file
What's the correct form of this command?
Recommended answer
$ echo 'a\nb'
a\nb
$ echo 'a\nb' | awk -F'\\\\n' '{print $1}'
a
Here's why: Consider these uses of the above characters in regexp comparisons:
- n = the literal character n ($0 ~ /n/)
- \n = a literal newline character ($0 ~ /\n/)
- \\ = a backslash when used in a regexp constant ($0 ~ /\\/)
- \\\\ = a backslash when used in a dynamic regexp ($0 ~ "\\\\")
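The cases in this list can be checked directly. The following is a minimal sketch, assuming a POSIX shell and a POSIX-compatible awk; the variable names (line, constant, dynamic, newline) are made up for illustration, and printf is used instead of echo because echo's handling of backslashes varies between shells.

```shell
# Sample record: contains the two characters \ and n, not a real newline.
line='a\nb'

# Regexp constant: /\\n/ means "a backslash followed by n".
constant=$(printf '%s\n' "$line" | awk '$0 ~ /\\n/ { print "match" }')

# Dynamic regexp: the string "\\\\n" is reduced to \\n by string parsing,
# and \\n is then read as a regexp meaning "a backslash followed by n".
dynamic=$(printf '%s\n' "$line" | awk '$0 ~ "\\\\n" { print "match" }')

# /\n/ looks for a real newline, which this single record does not contain.
newline=$(printf '%s\n' "$line" | awk '$0 ~ /\n/ { print "match" }')

printf 'constant=%s dynamic=%s newline=%s\n' "$constant" "$dynamic" "$newline"
```

The first two comparisons should report a match and the third should stay empty, which is the difference between matching the encoded "\n" and matching an actual line break.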
That last one is because a dynamic regexp is a string, which is parsed once when it is converted to a regexp and parsed again when it is used as that regexp. Since it is parsed twice, every escape has to be doubled.
Since a field separator is basically a regexp (with a few twists), when you say -F "whatever" you are defining the FS variable to be a dynamic regexp, so escapes have to be doubled.
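As a rough illustration of that rule, the sketch below sets the separator in two equivalent ways and also repeats one of the attempts from the question. It assumes a POSIX shell and a POSIX-compatible awk; the sample record and the variable names (entry, via_flag, via_begin, failed) are hypothetical.

```shell
# Hypothetical log entry: each \n is two literal characters, not a line break.
entry='first line\nsecond line\nthird line'

# -F with doubled escapes: awk receives \\\\n, string parsing leaves \\n,
# and the regexp \\n matches a backslash followed by n.
via_flag=$(printf '%s\n' "$entry" | awk -F'\\\\n' '{ print $1 }')

# The same separator assigned to FS inside the program.
via_begin=$(printf '%s\n' "$entry" | awk 'BEGIN { FS = "\\\\n" } { print $1 }')

# The attempt from the question: the shell passes \n, awk turns it into a
# real newline, and the record is never split.
failed=$(printf '%s\n' "$entry" | awk -F"\\n" '{ print $1 }')

printf '%s\n%s\n%s\n' "$via_flag" "$via_begin" "$failed"
```

Single quotes around the -F argument matter here: inside double quotes the shell would itself consume one level of backslashes before awk sees the string.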