Matching regex of multiple lines in AWK. && operator?
Problem description
I am not sure if the && operator works in regular expressions. What I am trying to do is match a line such that it starts with a number and has the letter 'a' AND the next line starts with a number and has the letter 'b' AND the next line... letter 'c'. This abc sequence will be used as a unique identifier to start reading the file.
Here is what I am sort of going for in awk.
/(^[0-9]+ .*a)&&\n(^[0-9]+ .*b)&&\n(^[0-9]+ .*c) {
print $0
}
Just one of these regex works like (^[0-9]+ .*a), but I am not sure how to string them together with AND THE NEXT LINE IS THIS.
My file would be like:
JUNK UP HERE NOT STARTING WITH NUMBER
1 a 0.110 0.069
2 a 0.062 0.088
3 a 0.062 0.121
4 b 0.062 0.121
5 c 0.032 0.100
6 d 0.032 0.100
7 e 0.032 0.100
And what I want is:
3 a 0.062 0.121
4 b 0.062 0.121
5 c 0.032 0.100
6 d 0.032 0.100
7 e 0.032 0.100
[Update based on clarification.]
The key point is that Awk is a line-oriented language: by default it reads and matches one line (record) at a time, so a single pattern cannot span lines. The usual way to do something like this is to match each line separately, and have a later clause or statement work out whether all the right pieces have been matched.
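As a small illustration of that point, the sketch below uses a two-line sample input invented for the demonstration (the variable names `input`, `spanning`, and `same_line` are likewise made up). It suggests that a regex containing `\n` never matches under the default record separator, while `&&` does work when it joins two conditions tested against the same line.

```shell
# Sample input, invented for this demonstration.
input='1 a 0.1
2 b 0.2'

# A regex that tries to span two lines never matches: with the default
# record separator, $0 holds a single line and contains no newline.
spanning=$(printf '%s\n' "$input" | awk '/a 0.1\n2 b/ { n++ } END { print n + 0 }')

# && combines two conditions that are both tested against the same line.
same_line=$(printf '%s\n' "$input" | awk '/^[0-9]+ / && $2 == "b" { print $1 }')

echo "spanning matches: $spanning"
echo "line with b: $same_line"
```

So `&&` belongs between patterns rather than inside a regex, and any "and the next line is ..." condition has to be tracked in variables across records.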
What I'm doing here is looking for an "a" in the second field on one line, a "b" in the second field on another line, and a "c" in the second field on a third line. In the first two cases, I stash away the contents of the line as well as the line number it occurred on. When the third line is matched and we haven't yet found the whole sequence, I go back and check whether the other two lines were seen on the two immediately preceding line numbers. If all's good, I print out the buffered previous lines and set a flag indicating that everything else should print.
Here's the script:
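# Remember the most recent "a" and "b" lines and where they occurred.
$2 == "a" { a = $0; aLine = NR }
$2 == "b" { b = $0; bLine = NR }
# On a "c" line, start printing only if "b" and "a" were the two lines just before it.
$2 == "c" && !keepPrinting {
    if (bLine == NR - 1 && aLine == NR - 2) {
        print a
        print b
        keepPrinting = 1
    }
}
# Once the sequence has been found, print this line and every later one.
keepPrinting { print }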
And here's a file I tested it with:
JUNK UP HERE NOT STARTING WITH NUMBER
1 a 0.110 0.069
2 a 0.062 0.088
3 a 0.062 0.121
4 b 0.062 0.121
5 c 0.032 0.100
6 d 0.032 0.100
7 e 0.032 0.100
8 a 0.099 0.121
9 b 0.098 0.121
10 c 0.097 0.100
11 x 0.000 0.200
Here's what I get when I run it:
$ awk -f blort.awk blort.txt
3 a 0.062 0.121
4 b 0.062 0.121
5 c 0.032 0.100
6 d 0.032 0.100
7 e 0.032 0.100
8 a 0.099 0.121
9 b 0.098 0.121
10 c 0.097 0.100
11 x 0.000 0.200
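The same buffering idea can also be written as a two-line sliding window instead of comparing line numbers. The following is only a sketch of that variant, not the script used above: the function name `from_abc` and the variables `prev1`, `prev2`, `prev1_key`, `prev2_key`, and `printing` are all hypothetical, and the sample is a shortened copy of the input from the question.

```shell
# Hypothetical helper: print everything from the first a/b/c run onward.
from_abc() {
    awk '
        # Once the sequence has been seen, print every remaining line.
        printing { print; next }
        # Current line is "c" and the two lines before it were "a" then "b".
        $2 == "c" && prev1_key == "b" && prev2_key == "a" {
            print prev2; print prev1; print
            printing = 1
            next
        }
        # Slide the two-line window forward.
        { prev2 = prev1; prev2_key = prev1_key; prev1 = $0; prev1_key = $2 }
    ' "$@"
}

sample='JUNK UP HERE NOT STARTING WITH NUMBER
1 a 0.110 0.069
2 a 0.062 0.088
3 a 0.062 0.121
4 b 0.062 0.121
5 c 0.032 0.100
6 d 0.032 0.100'

result=$(printf '%s\n' "$sample" | from_abc)
printf '%s\n' "$result"
```

Because the window always holds exactly the two preceding lines, no `NR` arithmetic is needed. Like the script above, this checks only the second field; it does not verify that each line starts with a number.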