sed/awk - print text between patterns spanned across multiple lines

Problem description

I am new to scripting and was trying to learn how to extract any text that exists between two different patterns. However, I am still not able to figure out how to extract text between two patterns in the following scenario:

Suppose my input file reads:

Hi I would like
to print text
between these 
patterns

and my expected output is:

I would like
to print text
between these 

i.e. my first search pattern is "Hi"; I want to skip the pattern itself but print everything that follows it on the same line. My second search pattern is "patterns", and I would like to completely avoid printing that line or any lines beyond it.

I tried the following:

sed -n '/Hi/,/patterns/p' test.txt 

[Output]

Hi I would like
to print text
between these 
patterns 

Next, I tried:

awk ' /'"Hi"'/ {flag=1;next} /'"pattern"'/{flag=0} flag { print }' test.txt

[Output]

to print text
between these

Can someone help me out in identifying how to achieve this? Thanks in advance

Recommended answer

You have the right idea, a mini-state-machine in awk but you need some slight mods as per the following transcript:

pax> echo 'Hi I would like
to print text
between these 
patterns ' | awk '
    /patterns/ { echo = 0 }
    /Hi /      { gsub("^.*Hi ", "", $0); echo = 1 }
               { if (echo == 1) { print } }'

Or, in compressed form:

awk '/patterns/{e=0}/Hi /{gsub("^.*Hi ","",$0);e=1}{if(e==1){print}}'

the output of which is:

I would like
to print text
between these 

as requested.

The way this works is as follows. The echo variable is initially 0 (awk treats an unset variable as 0 in a numeric comparison), meaning that no echoing will take place.

Each line is checked in turn. If it contains patterns, echoing is disabled.

If it contains Hi followed by a space, echoing is turned on and gsub is used to modify the line to get rid of everything up to and including the Hi and the space after it.

Then, regardless, the line (possibly modified) is echoed when the echo flag is on.
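Since the question also mentions sed, the same on/off logic can be sketched there as well. This is only an illustrative alternative, not part of the original answer: the function name between_sed is made up for the example, and it assumes the start marker is "Hi " and the end marker is "patterns", exactly as in the awk version.

```shell
# Hypothetical helper: reads text on stdin, prints what lies between
# the "Hi " marker and the first following line containing "patterns".
between_sed() {
    # -n             : print nothing unless told to with p
    # /Hi /,/patterns/ : act only on the range from the start line to the end line
    # /patterns/d    : drop the closing line (d also skips the remaining commands)
    # s/^.*Hi //     : on the opening line, strip everything up to and including "Hi "
    # p              : print whatever is left
    sed -n '/Hi /,/patterns/{/patterns/d;s/^.*Hi //;p;}'
}
```

It can be used as between_sed < test.txt. Unlike the awk version, sed only starts looking for the end marker on the line after the start marker.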

Now, there are going to be edge cases such as:

  • lines containing two occurrences of Hi; or
  • lines containing something before the patterns.

You haven't specified how they should be handled so I didn't bother, but the basic concept should be the same.
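For what it's worth, here is one possible way to pin those edge cases down. The choices are assumptions, since the question does not specify them: only the first "Hi " on the opening line counts, text that appears before "patterns" on the closing line is kept, and processing stops at the first "patterns". The function name extract_between is invented for the sketch.

```shell
# Hypothetical helper: reads text on stdin and prints what lies between
# the first "Hi " and the first later occurrence of "patterns".
extract_between() {
    awk '
        # Not echoing yet: look for the first "Hi " and keep only what follows it.
        !echo {
            pos = index($0, "Hi ")
            if (pos == 0) next            # no start marker on this line, skip it
            $0 = substr($0, pos + 3)      # drop everything through the first "Hi "
            echo = 1
        }
        # Echoing: stop at "patterns", keeping any text in front of it.
        {
            pos = index($0, "patterns")
            if (pos > 0) {
                if (pos > 1) print substr($0, 1, pos - 1)
                exit                      # nothing after the end marker is printed
            }
            print
        }'
}
```

Using index and substr instead of a greedy regex is what makes the first "Hi " win when a line contains two of them. On the sample file from the question, extract_between < test.txt gives the same three lines as the original answer.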
