How to get the part of a file after the first line that matches a regular expression
Question
I have a file with about 1000 lines. I want the part of the file after the line that matches my grep statement.
That is:
cat file | grep 'TERMINATE' # It is found on line 534
So, I want the file from line 535 to line 1000 for further processing.
How can I do that?
Recommended answer
The following prints from the line matching TERMINATE through the end of the file (the matching line is included):
sed -n -e '/TERMINATE/,$p'
Explanation: -n disables sed's default behavior of printing each line after executing its script on it; -e introduces a script for sed; /TERMINATE/,$ is an address (line) range selection meaning from the first line matching the TERMINATE regular expression (as with grep) to the end of the file ($); and p is the print command, which prints the current line.
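As a quick check of the range address, here is a minimal sketch on a made-up five-line input. The file contents and the temporary file are assumptions for illustration only, not taken from the question.

```shell
# Build a small sample file; its contents are invented for this demo.
sample=$(mktemp)
printf '%s\n' alpha beta TERMINATE gamma delta > "$sample"

# Print from the first line matching TERMINATE through the last line ($).
from_match=$(sed -n -e '/TERMINATE/,$p' "$sample")
printf '%s\n' "$from_match"

rm -f "$sample"
```

On this input the command prints TERMINATE, gamma and delta, one per line.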
This prints from the line that follows the line matching TERMINATE through the end of the file (from AFTER the matching line to EOF, NOT including the matching line):
sed -e '1,/TERMINATE/d'
Explanation: 1,/TERMINATE/ is an address (line) range selection meaning from the first line of the input to the first line matching the TERMINATE regular expression, and d is the delete command, which deletes the current line and skips to the next line. Since sed's default behavior is to print each line, it prints the lines after TERMINATE to the end of the input.
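One caveat with 1,/TERMINATE/: when the second address is a regular expression, sed starts looking for it on the line after the one that matched the first address, so a TERMINATE on line 1 does not end the range. The sketch below uses an invented sample to show that case, next to an awk one-liner that behaves the same wherever the first match is. The awk version is a swapped-in alternative, not part of the original answer.

```shell
# Sample input (invented) where the very first line already matches.
sample=$(mktemp)
printf '%s\n' TERMINATE one two TERMINATE three > "$sample"

# sed: the range starts on line 1, the search for /TERMINATE/ starts on
# line 2, so everything through the SECOND match (line 4) is deleted.
sed_out=$(sed -e '1,/TERMINATE/d' "$sample")

# awk: print only once the flag is set; the flag is set after the print
# check, so the first matching line itself is not printed.
awk_out=$(awk 'found { print } /TERMINATE/ { found = 1 }' "$sample")

printf 'sed:\n%s\nawk:\n%s\n' "$sed_out" "$awk_out"
rm -f "$sample"
```

GNU sed also accepts 0,/TERMINATE/d as an extension for this situation; that form is not portable to other sed implementations.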
If you want the lines before TERMINATE:
sed -e '/TERMINATE/,$d'
And if you want the lines before and after TERMINATE written to two different files in a single pass:
sed -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file
The before and after files will both contain the TERMINATE line, so to process each one without it, use the following (the negative count in head -n -1 requires GNU head):
head -n -1 before
tail -n +2 after
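Where GNU head is not available, sed '$d' is a portable way to drop the last line. The following sketch runs the single-pass split on an invented sample in a temporary directory and then strips the TERMINATE line from each result. The directory, the sample contents, and the added -n (which suppresses the copy of the input on standard output) are choices made for this illustration.

```shell
# Work in a throwaway directory so the "before" and "after" files
# written by sed do not clobber anything.
workdir=$(mktemp -d)
printf '%s\n' alpha beta TERMINATE gamma delta > "$workdir/file"

# One pass: lines 1..match go to "before", match..end go to "after".
# The newline after "before" ends the first file name.
( cd "$workdir" && sed -n -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file )

# Drop the TERMINATE line from each file.
before_only=$(sed '$d' "$workdir/before")      # portable "all but last line"
after_only=$(tail -n +2 "$workdir/after")      # everything from line 2 on

printf 'before:\n%s\nafter:\n%s\n' "$before_only" "$after_only"
rm -rf "$workdir"
```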
If you do not want to hard-code the file names in the sed script, you can:
before=before.txt
after=after.txt
sed -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" file
But then you have to escape the $ that means the last line (writing \$w) so that the shell does not try to expand a $w variable. Note that the script is now wrapped in double quotes instead of single quotes.
Note that the newline after each file name in the script is important: it is how sed knows where the file name ends.
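To see the escaping and the newline rule working together, here is a small sketch on an invented sample. The temporary directory and file contents are assumptions for the demo, and -n is added only to keep sed from also echoing the input.

```shell
workdir=$(mktemp -d)
printf '%s\n' alpha beta TERMINATE gamma delta > "$workdir/file"
before="$workdir/before.txt"
after="$workdir/after.txt"

# Inside double quotes the shell expands $before and $after, while \$
# reaches sed as a literal $ (the last-line address).
sed -n -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" "$workdir/file"

before_content=$(cat "$before")
after_content=$(cat "$after")
printf 'before.txt:\n%s\nafter.txt:\n%s\n' "$before_content" "$after_content"
rm -rf "$workdir"
```

Both output files include the TERMINATE line, as described above.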
How would you replace the hardcoded TERMINATE with a variable?
You would make a variable for the matching text and then do it the same way as in the previous example:
matchtext=TERMINATE
before=before.txt
after=after.txt
sed -e "1,/$matchtext/w $before
/$matchtext/,\$w $after" file
To use a variable for the matching text with the earlier examples:
## Print the line containing the matching text, till the end of the file:
## (from the matching line to EOF, including the matching line)
matchtext=TERMINATE
sed -n -e "/$matchtext/,\$p"
## Print from the line that follows the line containing the
## matching text, till the end of the file:
## (from AFTER the matching line to EOF, NOT including the matching line)
matchtext=TERMINATE
sed -e "1,/$matchtext/d"
## Print all the lines before the line containing the matching text:
## (from line-1 to BEFORE the matching line, NOT including the matching line)
matchtext=TERMINATE
sed -e "/$matchtext/,\$d"
The important points about replacing text with variables in these cases are:
- Variables ($variablename) enclosed in single quotes ['] won't "expand", but variables inside double quotes ["] will. So you have to change all the single quotes to double quotes if they contain text you want to replace with a variable.
- The sed ranges also contain a $ that is immediately followed by a letter, like $p, $d, $w. These will also look like variables to be expanded, so you have to escape those $ characters with a backslash [\], like \$p, \$d, \$w.
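One more point follows from the same mechanism: the variable's value is pasted into the sed script as-is, so a / or a regular-expression metacharacter inside it changes the address. The sketch below shows one way to backslash-escape such characters before use. The sample text, the variable name escaped, and the exact set of escaped characters (aimed at basic regular expressions with / as the delimiter) are assumptions for illustration.

```shell
# Invented sample: the text to match contains a slash and a dot.
sample=$(mktemp)
printf '%s\n' 'a/bXc' 'a/b.c' 'tail' > "$sample"
matchtext='a/b.c'

# Prefix ] [ \ . * ^ $ and / with a backslash so sed treats them literally.
# "|" is used as the s command delimiter to keep "/" unambiguous here.
escaped=$(printf '%s\n' "$matchtext" | sed 's|[][\.*^$/]|\\&|g')

# Delete from line 1 through the first line containing the literal text.
result=$(sed -e "1,/$escaped/d" "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

With the escaping in place, the first line (a/bXc) is not treated as a match, because the dot is no longer a wildcard.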