sed one-liner - Find delimiter pair surrounding keyword

Views: 354 Published: 2016/8/3 12:03:43 Tags: xml bash sed grep

Problem description

I typically work with large XML files, and generally do word counts via grep to confirm certain statistics.

For example, I want to make sure I have at least five instances of widget in a single XML file via:

cat test.xml | grep -ic widget
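
Note that grep -c reports the number of matching lines, so a line that contains widget twice is counted once. If every occurrence should be counted, a minimal sketch along the following lines may help. The function name count_occurrences is hypothetical, and the sketch assumes a grep that supports the -o option (GNU and BSD grep do; POSIX does not require it).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original question): count every
# case-insensitive occurrence of a word instead of the number of matching lines.
# Assumes grep supports -o, which prints each match on its own line.
count_occurrences() {
    # $1 = word to search for, $2 = file to search
    # tr strips the padding that some wc implementations add.
    grep -io -- "$1" "$2" | wc -l | tr -d ' '
}
```

Called as count_occurrences widget test.xml, it prints a single number.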

Additionally, I would like to be able to log the lines that widget appears on, i.e.:

cat test.xml | grep -i widget > ~/log.txt

However, the key information I really need is the block of XML code that widget appears in. An example file may look like:

<test> blah blah
  blah blah blah
  widget
  blah blah blah
</test>

<formula>
  blah
  <details> 
    widget
  </details>
</formula>

I am trying to get the following output from the sample text above, i.e.:

<test>widget</test>

<formula>widget</formula>

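Effectively, I am trying to get a single line containing the highest-level markup tags that apply to the block of XML text/code surrounding an arbitrary string, widget.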

Does anyone have any suggestions for implementing this via a command-line one-liner?

Thank you.

Recommended answer

A non-elegant way using both sed and awk:

sed -ne '/[Ww][Ii][Dd][Gg][Ee][Tt]/,/^<\// {//p}' file.txt | awk 'NR%2==1 { sub(/^[ \t]+/, ""); search = $0 } NR%2==0 { end = $0; sub(/^<\//, "<"); printf "%s%s%s\n", $0, search, end }'

Results:

<test>widget</test>
<formula>widget</formula>

Explanation:

## The sed pipe:

sed -ne '/[Ww][Ii][Dd][Gg][Ee][Tt]/,/^<\// {//p}'
## This finds the widget pattern, ignoring case, then finds the next closing
## top-level markup tag (which must start at the beginning of the line).
## The empty regex // reuses the last regex applied, so only the two
## boundary lines are printed: two lines for each pattern match

## Now the awk pipe:

NR%2==1 { sub(/^[ \t]+/, ""); search = $0 }
## This takes the first line (the widget pattern) and removes leading
## whitespace, saving the pattern in 'search'

NR%2==0 { end = $0; sub(/^<\//, "<"); printf "%s%s%s\n", $0, search, end }
## This finds the next line (which is even), and stores the markup tag in 'end'
## We then remove the slash from this tag and print it, the widget pattern, and
## the saved markup tag

HTH
