How to find a multiline pattern match (it must be the first match)?
Question
I know about the question "How to find patterns across multiple lines using grep?", but I think my problem is more complicated, so I need help.
I have a dictionary file, BCFile, that looks like this:
    boundary
    {
        inlet
        {
            type fixedValue;
            value uniform (5 0 0);
        }
        outlet
        {
            type inletOutlet;
            inletValue $internalField;
            value $internalField;
        }
        ....
    }
I am writing a script to print out the inlet boundary condition, fixedValue, and the outlet boundary condition, inletOutlet.
If I use

    cat BCFile | grep "type" | awk '{printf $2}' | tr -d ";"

it won't work, because the keyword type occurs many times.
If I use

    awk -v RS='}' '/inlet/ { print $4 }' BCFile

it won't work either, because the keyword inlet also occurs many times.
I need a way to find a pattern that first searches for the keyword inlet and then searches for the closest { and }.
Does anyone know how to do this smartly?
Recommended answer
Since you didn't provide the expected output for the input you posted, we're just guessing at what output you want, but how about this in GNU awk:
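    $ cat tst.awk
    # NUL as record separator: the whole file is read as one record (gawk)
    BEGIN{ RS="\0" }
    {
        # replace the whole record with the word captured after "type"
        print "inlet:", gensub(/.*\yinlet\y[^}]*type\s+(\w+).*/,"\\1","")
        print "outlet:", gensub(/.*\youtlet\y[^}]*type\s+(\w+).*/,"\\1","")
    }
    $ gawk -f tst.awk file
    inlet: fixedValue
    outlet: inletOutlet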
Explanation:
RS="\0" = set the Record Separator to the NUL character so that awk reads the whole file as a single record (this assumes the file itself contains no NUL characters).
gensub(/.*\yinlet\y[^}]*type\s+(\w+).*/,"\\1","") = look for the word inlet followed by any characters except a } (so you stop before the first } after inlet instead of the last } in the file), and then the word type followed by white space. The alphanumeric string after that, matched by (\w+), is the word you want printed, so the parentheses capture it, and the whole record is then replaced with just that string, referenced as \\1.
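To see the role of [^}]* on its own, here is a minimal sketch of the same search written with the POSIX match() function and wrapped in a shell function. The name first_type is hypothetical and not part of the answer above. Because \y is not available here, the name is matched as a plain substring, so it may also match longer names that merely contain it.

```shell
# first_type NAME: read text on stdin and print the word that follows "type"
# inside the first brace group after NAME; print nothing if there is none.
# (first_type is a hypothetical helper name used only for this sketch.)
first_type() {
    awk -v name="$1" '
        { text = text $0 "\n" }            # collect the whole input
        END {
            # NAME, then anything except "}", then "type" and white space
            if (match(text, name "[^}]*type[ \t\n]+")) {
                rest = substr(text, RSTART + RLENGTH)
                if (match(rest, /^[A-Za-z0-9_]+/))
                    print substr(rest, 1, RLENGTH)
            }
        }'
}
```

Running first_type inlet < BCFile on the sample file should print fixedValue. If the inlet block had no type entry at all, [^}]* could not cross its closing } to reach the type entry of a later block, so nothing would be printed.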
Setting RS="\0" and using gensub() are both gawk-specific, as are the \y, \s and \w regexp operators.
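For an awk without those extensions, a line-by-line approach is one possible alternative. The sketch below is not the method from the answer above; the function name bc_type is hypothetical, and it assumes that the block name sits alone on its own line and that each entry has the form "type value;" on a single line, as in the sample file.

```shell
# bc_type NAME FILE: print the value of the first "type" entry in the block
# that follows a line consisting only of NAME.
# (bc_type is a hypothetical helper name used only for this sketch.)
bc_type() {
    awk -v name="$1" '
        $1 == name && NF == 1 { found = 1; next }   # line holding just the name
        found && $1 == "type" {
            sub(/;$/, "", $2)                        # drop the trailing semicolon
            print $2
            exit                                     # first match only
        }
        found && $1 == "}" { exit }                  # block ended without a type
    ' "$2"
}
```

With the sample file, bc_type inlet BCFile should print fixedValue and bc_type outlet BCFile should print inletOutlet. Comparing $1 against the whole name means inletOutlet and inletValue are not mistaken for inlet, which plays the role that \y has in the gawk version.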