How can I get "grep -zoP" to display every match separately?

Question

I have a file of this form:

X/this is the first match/blabla
X-this is
the second match-

and here we have some fluff.

And I want to extract everything that appears after "X" and between the same markers. So if I have "X+match+", I want to get "match", because it appears after "X" and between the marker "+".

So for the given sample file I would like to have this output:

this is the first match

and

this is
the second match

I managed to get all the content between "X" followed by a marker and the next occurrence of that marker by using:

grep -zPo '(?<=X(.))(.|\n)+(?=\1)' file

That is:

  • grep -Po '(?<=X(.))(.|\n)+(?=\1)' to match X followed by (something) that gets captured and matched at the end with (?=\1) (I based the code on my answer here).
  • Note I use (.|\n) to match anything, including a new line, and that I also use -z in grep to match new lines as well.

So this works well, the only problem comes from the display of the output:

$ grep -zPo '(?<=X(.))(.|\n)+(?=\1)' file
this is the first matchthis is
the second match

As you can see, all the matches appear together, with "this is the first match" being followed by "this is the second match" with no separator at all. I know this comes from the use of "-z", which treats the whole file as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline (quoting "man grep").
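The separator is in fact present in the output as a NUL byte, which a terminal simply does not render. A minimal sketch that makes this visible by counting the NUL bytes, assuming GNU grep built with PCRE support; the temporary file path and the variable names are only for illustration:

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
printf 'X/this is the first match/blabla\nX-this is\nthe second match-\n\nand here we have some fluff.\n' > "$sample"

# Keep only the NUL bytes of the output and count them:
# with -z, grep -o terminates every match with one NUL.
nul_count=$(grep -zPo '(?<=X(.))(.|\n)+(?=\1)' "$sample" | tr -cd '\000' | wc -c)
echo "NUL terminators: $nul_count"

rm -f "$sample"
```

One NUL per match means the matches can still be told apart by any tool that reads NUL-delimited input.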

So: is there a way to get all these results separately?

I also tried GNU Awk:

awk 'match($0, /X(.)(\n|.*)\1/, a) {print a[1]}' file

but not even the (\n|.*) worked.

Recommended answer

The use case is kind of problematic, because as soon as you print the matches, you lose the information about where exactly the separator was. But if that's acceptable, try piping to xargs -r0.

grep -zPo '(?<=X(.))(.|\n)+(?=\1)' file | xargs -r0
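Without a command, xargs runs echo once with all items as arguments, so the matches come out separated by a space. If one match per output line is wanted, the sketch below shows two variants, assuming GNU grep with PCRE support and GNU xargs; the temporary file and variable names are hypothetical:

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
printf 'X/this is the first match/blabla\nX-this is\nthe second match-\n\nand here we have some fluff.\n' > "$sample"

# Variant 1: translate every NUL terminator into a newline.
with_tr=$(grep -zPo '(?<=X(.))(.|\n)+(?=\1)' "$sample" | tr '\000' '\n')

# Variant 2: let xargs call echo once per NUL-delimited match.
with_xargs=$(grep -zPo '(?<=X(.))(.|\n)+(?=\1)' "$sample" | xargs -r0 -n1)

printf '%s\n' "$with_tr"

rm -f "$sample"
```

Both variants end each match with a newline, so a match that itself spans several lines can no longer be distinguished from several one-line matches. That is the loss of information mentioned above.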

These options are GNU extensions, but then so are grep -z and (mostly) grep -P, so perhaps that's acceptable.
