How to use word boundaries in awk when the search text comes from a variable

Question

I have the following setup for awk:

var="blue"

cat file
test
blue more
bluegrass not
yes red
more blue
fine blue, not

I need only the lines that contain blue as a whole word.

If I do this:

awk '/\<blue\>/' file
blue more
more blue
fine blue, not

I get the output I need (but this does not use a variable).

But how do I do this with a variable?

Here are some of my attempts:

awk '$0~"\<"test"\>"' test="$var" file
awk '$0~/\</test/\>/' test="$var" file
awk '{a="\<"test"\>"} $0~a' test="$var" file

All of these fail.

It needs to be awk only, since this is part of a larger test.

Update:
It turns out that some of my variables contain a + sign. This breaks the solution from Ed:

var="blue+"

cat file
test
blue+green more
bluegrass not
yes red
more blue+
fine blue+, not

awk -v test="$var" '$0~"\\<"test"\\>"' file
blue+green more
more blue+
fine blue+, not

Recommended answer

awk -v test="$var" '$0~"\\<"test"\\>"' file

Remember that a string used in a regexp context gets parsed twice: once when awk reads the string literal and again when the string is used as a regexp. Anything that needs to be escaped in the regexp therefore has to be escaped twice, which is why the string contains \\< rather than \<.
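The double parsing can be seen with a simpler, portable case. The sketch below is not part of the original answer: it uses a literal dot instead of \<, and the function name match_literal_dot is made up for illustration.

```shell
# Inside the awk string literal, "a\\.b" is first reduced to the four
# characters a\.b when the string is read. That result is then compiled
# as a regexp, where \. means a literal dot.
match_literal_dot() {
    awk '$0 ~ "a\\.b"'
}

printf '%s\n' 'a.b' 'axb' | match_literal_dot
# prints: a.b
```

Only the line with a real dot is printed, because both layers of escaping have been accounted for.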

Also note that \< (like \>) is a gawk extension and is not part of POSIX awk.
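If gawk is not available, the word boundaries can be approximated with ordinary regexp syntax. This is a minimal sketch rather than part of the original answer: the function name match_word_re is hypothetical, a "word character" is assumed to be an ASCII letter, digit, or underscore, and the variable is still interpreted as a regexp, so it only suits search text without RE metacharacters.

```shell
# match_word_re WORD: print the lines on stdin that contain WORD
# delimited by line start/end or a non-word character.
match_word_re() {
    awk -v test="$1" '
    BEGIN { re = "(^|[^A-Za-z0-9_])" test "([^A-Za-z0-9_]|$)" }
    $0 ~ re
    '
}

printf '%s\n' 'test' 'blue more' 'bluegrass not' 'yes red' 'more blue' 'fine blue, not' |
    match_word_re blue
# prints:
#   blue more
#   more blue
#   fine blue, not
```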

Given the updated info that the text you want to search for can contain RE metacharacters, you need to either:

  1. escape all RE metacharacters that can occur in the text, or
  2. treat the text as a string instead of a regexp.

Escaping RE metacharacters is trivial if you only have a couple of them in specific contexts to worry about, and I'm sure you can figure that out. In general, though, it is difficult (impossible?) because the meaning of these characters depends on context. So I'll focus on how to detect a string that's not part of a longer "word":

awk -v test="$var" '
    (s=index($0,test)) &&                            # test exists and is neither
    ((s>1?substr($0,s-1,1):"") !~ /[[:alnum:]_]/) && # preceded by a word char nor
    (substr($0,s+length(test),1) !~ /[[:alnum:]_]/)  # followed by a word char
' file
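One limitation of the index() test above is that it inspects only the first occurrence of the text on each line, so a line such as "blue+green and blue+" would be rejected. The following is a sketch of one way to check every occurrence. It is not from the original answer: the names match_word_all and has_word are hypothetical, and an ASCII-only bracket expression is assumed for the word characters.

```shell
# match_word_all WORD: print the lines on stdin where WORD occurs at least
# once without a word character directly before or after it.
match_word_all() {
    awk -v test="$1" '
    function has_word(line, word,    off, s, before, after) {
        if (word == "") return 0
        off = 0
        # Search the remainder of the line after each rejected occurrence.
        while ((s = index(substr(line, off + 1), word)) > 0) {
            s += off                       # position within the whole line
            before = (s > 1) ? substr(line, s - 1, 1) : ""
            after  = substr(line, s + length(word), 1)
            if (before !~ /[A-Za-z0-9_]/ && after !~ /[A-Za-z0-9_]/)
                return 1
            off = s                        # resume just past this start
        }
        return 0
    }
    has_word($0, test)
    '
}

printf '%s\n' 'test' 'blue+green more' 'bluegrass not' 'more blue+' \
    'fine blue+, not' 'blue+green and blue+' | match_word_all 'blue+'
# prints:
#   more blue+
#   fine blue+, not
#   blue+green and blue+
```

Because the text is compared with index() rather than used as a regexp, the + needs no escaping.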
