How to use a word boundary in awk when using data from a variable
Problem description
I have the following variable and file, which I want to process with awk:
var="blue"
cat file
test
blue more
bluegrass not
yes red
more blue
fine blue, not
I need only the lines that contain blue as a word of its own, more or less.
If I do this:
awk '/\<blue\>/' file
blue more
more blue
fine blue, not
I get the output I need (but this does not use a variable).
But how do I do this with a variable?
Here are some of my attempts:
awk '$0~"\<"test"\>"' test="$var" file
awk '$0~/\</test/\>/' test="$var" file
awk '{a="\<"test"\>"} $0~a' test="$var" file
All of these fail.
The solution needs to be awk only, since this is part of a larger test.
Update.
It seems that some of my variables contain a + sign. This breaks the solution from Ed:
var="blue+"
cat file
test
blue+green more
bluegrass not
yes red
more blue+
fine blue+, not
awk -v test="$var" '$0~"\\<"test"\\>"' file
blue+green more
more blue+
fine blue+, not
Recommended answer
awk -v test="$var" '$0~"\\<"test"\\>"' file
Remember that strings used in regexp contexts get parsed twice, once when the script is read and again when the regexp is executed, so anything that needs to be escaped has to be escaped twice.
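To make the first of those two parsing passes visible, here is a minimal sketch that prints the string awk ends up with after it has read the string literals. The function name `show_regex_string` is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper: print the string that awk would hand to the
# regexp engine after parsing the "\\<" and "\\>" string literals.
show_regex_string() {
    awk -v test="$1" 'BEGIN { print "\\<" test "\\>" }'
}

show_regex_string blue
```

Each `\\` in the string literal collapses to a single backslash on the first pass, so what reaches the regexp engine on the second pass is `\<blue\>`.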
Also note that \< is gawk-only.
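If gawk is not available, one possible workaround is to spell the boundaries out with POSIX bracket expressions. This is only a sketch under assumptions: the wrapper name `match_word_portable` is hypothetical, a "word character" is taken to be `[[:alnum:]_]`, and the awk in use is assumed to support POSIX character classes.

```shell
# Hypothetical wrapper: emulate \< and \> without gawk extensions.
# $1 is the word (still interpreted as a regexp); remaining args are files.
match_word_portable() {
    word=$1; shift
    # Before the word: start of line or a non-word character.
    # After the word: a non-word character or end of line.
    awk -v test="$word" '$0 ~ ("(^|[^[:alnum:]_])" test "([^[:alnum:]_]|$)")' "$@"
}

printf 'blue more\nbluegrass not\nmore blue\n' | match_word_portable blue
```

The variable is still treated as a regexp here, so any regexp metacharacters in it keep their special meaning.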
Given the updated info that the text you want to search for can contain RE metacharacters, you need to either:
- escape all the RE metacharacters that can appear in the text, or
- treat the text as a string
Escaping RE metacharacters is trivial if you only have a couple of them in specific contexts to worry about, and I'm sure you can figure that out, but it is difficult (impossible?) in general because the characters are context-sensitive. So I'll focus on how to detect a string that's not part of a longer "word":
awk -v test="$var" '
(s=index($0,test)) && # test exists and is neither
((s>1?substr($0,s-1,1):"") !~ /[[:alnum:]_]/) && # preceded by a word char nor
(substr($0,s+length(test),1) !~ /[[:alnum:]_]/) # followed by a word char
' file
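One limitation worth noting: index() reports only the first occurrence, so a line such as "bluegrass and blue" would be rejected even though it contains a standalone blue. The sketch below is a possible extension, not part of the original answer, that checks every occurrence. The names `match_whole_word` and `has_word` are hypothetical, and a "word character" is again assumed to be `[[:alnum:]_]`.

```shell
# Hypothetical wrapper: $1 is the literal text, remaining args are files.
match_whole_word() {
    needle=$1; shift
    awk -v test="$needle" '
    # Return 1 if word occurs in line without a word char on either side.
    function has_word(line, word,    pos, off, before, after) {
        if (word == "") return 0
        off = 0
        while ((pos = index(substr(line, off + 1), word)) > 0) {
            pos += off                       # position within the full line
            before = (pos > 1) ? substr(line, pos - 1, 1) : ""
            after  = substr(line, pos + length(word), 1)
            if (before !~ /[[:alnum:]_]/ && after !~ /[[:alnum:]_]/)
                return 1
            off = pos                        # resume just past this start
        }
        return 0
    }
    has_word($0, test)
    ' "$@"
}

printf 'blue+green more\nbluegrass and blue+\nfine blue+, not\n' | match_whole_word 'blue+'
```

Because the text is compared with index() rather than used as a regexp, the + needs no escaping. Keep in mind that awk still processes backslash escape sequences in values passed with -v.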