Sed to extract text between two strings
Question
Please help me use sed. I have a file like the one below.
START=A
xxxxx
xxxxx
END
START=A
xxxxx
xxxxx
END
START=A
xxxxx
xxxxx
END
START=B
xxxxx
xxxxx
END
START=A
xxxxx
xxxxx
END
START=C
xxxxx
xxxxx
END
START=A
xxxxx
xxxxx
END
START=D
xxxxx
xxxxx
END
I want to get the text between START=A and END. I used the command below.
sed '/^START=A/, / ^END/!d' input_file
The problem is that I am getting
START=A
xxxxx
xxxxx
END
START=D
xxxxx
xxxxx
END
instead of
START=A
xxxxx
xxxxx
END
Sed seems to be matching greedily.
Please help me resolve this.
Thanks in advance.
Can I use AWK to achieve the above?
Recommended answer
sed -n '/^START=A$/,/^END$/p' data
The -n option means don't print by default; the script then says 'do print from each line that is exactly START=A through the next line that is exactly END'.
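The $ anchors in the answer's patterns matter when one tag is a prefix of another. The following is a minimal sketch, assuming a hypothetical START=AB block (not present in the question's data) and a throwaway file named sample.txt:

```shell
# Hypothetical sample: START=AB is an invented tag used only for illustration.
tmpdir=$(mktemp -d)
data="$tmpdir/sample.txt"
printf '%s\n' START=AB yyy01 END START=A xxx01 END > "$data"

# Not anchored at the end of the line: also selects the START=AB block.
loose_out=$(sed -n '/^START=A/,/^END/p' "$data")

# Anchored at both ends, as in the answer: selects only the START=A block.
strict_out=$(sed -n '/^START=A$/,/^END$/p' "$data")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose_out" "$strict_out"
rm -rf "$tmpdir"
```

With this sample, the loose pattern prints all six lines, while the anchored pattern prints only the three lines of the START=A block.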
You can also do it with awk:
A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines from an occurrence of the first pattern through an occurrence of the second.
(from man awk on Mac OS X).
awk '/^START=A$/,/^END$/ { print }' data
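For data shaped like the question's, the range pattern behaves much like an explicit on/off flag. The sketch below compares the two forms; the variable name inside and the file name sample.txt are illustrative assumptions, not part of the original answer:

```shell
# Hypothetical sample file with two START=A blocks and one START=B block.
tmpdir=$(mktemp -d)
data="$tmpdir/sample.txt"
printf '%s\n' START=A xxx01 END START=B xxx02 END START=A xxx03 END > "$data"

# Range pattern, as in the answer.
range_out=$(awk '/^START=A$/,/^END$/ { print }' "$data")

# Explicit flag: switch on at START=A, print while on,
# switch off after the END line has been printed.
flag_out=$(awk '/^START=A$/ { inside = 1 }
                inside       { print }
                /^END$/      { inside = 0 }' "$data")

printf '%s\n' "$flag_out"
rm -rf "$tmpdir"
```

The flag form is longer, but it is easier to extend, for example to skip the START and END marker lines by reordering the rules.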
Given a modified form of the data file in the question:
START=A
xxx01
xxx02
END
START=A
xxx03
xxx04
END
START=A
xxx05
xxx06
END
START=B
xxx07
xxx08
END
START=A
xxx09
xxx10
END
START=C
xxx11
xxx12
END
START=A
xxx13
xxx14
END
START=D
xxx15
xxx16
END
The output using GNU sed or Mac OS X (BSD) sed, and using GNU awk or BSD awk, is the same:
START=A
xxx01
xxx02
END
START=A
xxx03
xxx04
END
START=A
xxx05
xxx06
END
START=A
xxx09
xxx10
END
START=A
xxx13
xxx14
END
Note how I modified the data file so that it is easier to see where in the file each printed block came from.
If you have a different output requirement (such as 'only the first block between START=A and END', or 'only the last ...'), then you need to state that more clearly in the question.
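If the requirement did turn out to be 'only the first block' or 'only the last block', one possible approach is sketched below. This is an assumption about what the asker might want, not something the question states; the file name sample.txt and the variable names buf, inside, and last are illustrative:

```shell
# Hypothetical sample file: two START=A blocks around one START=B block.
tmpdir=$(mktemp -d)
data="$tmpdir/sample.txt"
printf '%s\n' START=A xxx01 xxx02 END START=B xxx03 xxx04 END \
              START=A xxx05 xxx06 END > "$data"

# First block only: print the range and quit at its first END line.
first_sed=$(sed -n '/^START=A$/,/^END$/{p;/^END$/q;}' "$data")
first_awk=$(awk '/^START=A$/,/^END$/ { print; if (/^END$/) exit }' "$data")

# Last block only: buffer each START=A block and keep the most recent one.
last_awk=$(awk '/^START=A$/ { buf = ""; inside = 1 }
                inside       { buf = buf $0 "\n" }
                /^END$/      { if (inside) last = buf; inside = 0 }
                END          { printf "%s", last }' "$data")

printf 'first:\n%s\n\nlast:\n%s\n' "$first_sed" "$last_awk"
rm -rf "$tmpdir"
```

Quitting at the first END also avoids reading the rest of a large file, which is why the first-block variants stop early instead of filtering afterwards.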