Matching regex of multiple lines in AWK. && operator?

Views: 138 Published: 2016/7/28 16:48:43 Tags: python regex parsing awk

Problem description

I am not sure if the && operator works in regular expressions. What I am trying to do is match a line such that it starts with a number and has the letter 'a' AND the next line starts with a number and has the letter 'b' AND the next line... letter 'c'. This abc sequence will be used as a unique identifier to start reading the file.

Here is what I am sort of going for in awk.

/(^[0-9]+ .*a)&&\n(^[0-9]+ .*b)&&\n(^[0-9]+ .*c) {
print $0
}

Just one of these regex works like (^[0-9]+ .*a), but I am not sure how to string them together with AND THE NEXT LINE IS THIS.

My file would be like:

JUNK UP HERE NOT STARTING WITH NUMBER
1     a           0.110     0.069          
2     a           0.062     0.088          
3     a           0.062     0.121          
4     b           0.062     0.121          
5     c           0.032     0.100         
6     d           0.032     0.100          
7     e           0.032     0.100

And what I want is:

3     a           0.062     0.121          
4     b           0.062     0.121          
5     c           0.032     0.100         
6     d           0.032     0.100          
7     e           0.032     0.100

Solution

[Update based on clarification.]

One high order bit is that Awk is a line-oriented language, so you won't actually be able to do a normal pattern match to span lines. The usual way to do something like this is to match each line separately, and have a later clause / statement figure out if all the right pieces have been matched.
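As an aside on the && part of the question, here is a minimal sketch of the two points above. In awk, && joins whole patterns that are each tested against the current line; it is not regex syntax. A regex that contains \n has nothing to match with the default record separator, because each record is a single line. The function names and the sample input are made up for illustration.

```shell
# && combines two awk patterns; both are tested against the same line.
both_on_one_line() { awk '/^[0-9]+ / && / a( |$)/'; }

# A regex containing \n cannot match here: with the default record
# separator, each record is one line with its newline already removed.
spans_lines() { awk '/a.*\n.*b/ { hit = 1 } END { print hit + 0 }'; }

printf '%s\n' 'JUNK a' '1 a 0.110' '2 b 0.062' | both_on_one_line   # prints: 1 a 0.110
printf '%s\n' '1 a 0.110' '2 b 0.062' | spans_lines                  # prints: 0
```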

What I'm doing in the script below is looking for an "a" in the second field on one line, a "b" in the second field on the next line, and a "c" in the second field on the line after that. In the first two cases, I stash away the contents of the line as well as the line number it occurred on. When the third line is matched and we haven't yet found the whole sequence, I go back and check whether the other two lines are present and sit on the two immediately preceding line numbers. If all's good, I print out the buffered previous lines and set a flag indicating that everything else should print.

Here's the script:

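# Remember the most recent "a" and "b" lines and where they occurred.
$2 == "a" { a = $0; aLine = NR; }
$2 == "b" { b = $0; bLine = NR; }
# On a "c" line, start printing only if "b" and "a" are the two lines just before it.
$2 == "c" && !keepPrinting {
    if ((bLine == (NR - 1)) && (aLine == (NR - 2))) {
        print a;
        print b;
        keepPrinting = 1;
    }
}
# Once the sequence has been found, print every line (including the current "c" line).
keepPrinting { print; }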

And here's a file I tested it with:

JUNK UP HERE NOT STARTING WITH NUMBER
1     a           0.110     0.069
2     a           0.062     0.088
3     a           0.062     0.121
4     b           0.062     0.121
5     c           0.032     0.100
6     d           0.032     0.100
7     e           0.032     0.100
8     a           0.099     0.121
9     b           0.098     0.121
10    c           0.097     0.100
11    x           0.000     0.200

Here's what I get when I run it:

$ awk -f blort.awk blort.txt
3     a           0.062     0.121
4     b           0.062     0.121
5     c           0.032     0.100
6     d           0.032     0.100
7     e           0.032     0.100
8     a           0.099     0.121
9     b           0.098     0.121
10    c           0.097     0.100
11    x           0.000     0.200
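For comparison, here is a sketch of a variant that keeps the two previous lines in a sliding window instead of recording line numbers, and tests all three lines with regexes closer to the ones in the question. This is not the script above, just one possible alternative; the names find_abc, p1, p2 and started are hypothetical, and the regexes assume the letter is the second whitespace-separated field.

```shell
find_abc() {
  awk '
    # p2 and p1 hold the two lines before the current one.
    !started && p2 ~ /^[0-9]+ +a( |$)/ && p1 ~ /^[0-9]+ +b( |$)/ && $0 ~ /^[0-9]+ +c( |$)/ {
      print p2; print p1      # emit the buffered "a" and "b" lines
      started = 1
    }
    started { print }         # from the "c" line onward, print everything
    { p2 = p1; p1 = $0 }      # slide the window forward by one line
  '
}

# Usage: prints the lines from "2 a x" through "5 d x".
printf '%s\n' 'JUNK' '1 a x' '2 a x' '3 b x' '4 c x' '5 d x' | find_abc
```

The window costs two extra variables but avoids comparing line numbers, and the same shape extends to longer sequences by adding more buffered lines.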
