How to get the part of a file after the first line that matches a regular expression?
Question
I have a file with about 1000 lines. I want the part of the file that comes after the line matching my grep statement.
That is:
$ cat file | grep 'TERMINATE' # It is found on line 534
So, I want lines 535 through 1000 of the file for further processing.
How can I do that?
Recommended answer
The following prints from the line matching TERMINATE through the end of the file (the matching line is included):
sed -n -e '/TERMINATE/,$p'
Explanation: -n disables sed's default behavior of printing each line after running the script on it; -e introduces the script; /TERMINATE/,$ is an address (line) range that runs from the first line matching the regular expression TERMINATE (as grep would match it) to the end of the file ($); and p is the print command, which prints the current line.
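To make the inclusive range concrete, here is a small sketch run against a throwaway sample file. The file contents and the variable names (sample, from_match) are made up for illustration and are not part of the original answer.

```shell
# Build a hypothetical 5-line sample file; the match is on line 3.
sample=$(mktemp)
printf '%s\n' alpha beta TERMINATE gamma delta > "$sample"

# Print from the first matching line through the end of the file.
from_match=$(sed -n -e '/TERMINATE/,$p' "$sample")
printf '%s\n' "$from_match"
# prints:
#   TERMINATE
#   gamma
#   delta

rm -f "$sample"
```

Note that the matching line itself is part of the output.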
This prints from the line that follows the line matching TERMINATE through the end of the file:
(from AFTER the matching line to EOF, NOT including the matching line)
sed -e '1,/TERMINATE/d'
Explanation: 1,/TERMINATE/ is an address (line) range that runs from the first line of input to the first line matching the regular expression TERMINATE, and d is the delete command, which deletes the current line and moves on to the next one. Because sed prints lines by default, the output is everything after the TERMINATE line through the end of input.
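The sketch below runs the delete range on a made-up sample and also shows one edge case worth knowing: when the pattern sits on line 1, sed starts looking for the end of the range on line 2, so 1,/TERMINATE/d can delete more than intended. The awk one-liner is an alternative technique, not part of the original answer; all file contents and variable names are hypothetical.

```shell
# Sample with two matches: only the first one ends the deleted range.
sample=$(mktemp)
printf '%s\n' alpha beta TERMINATE gamma TERMINATE delta > "$sample"
after_match=$(sed -e '1,/TERMINATE/d' "$sample")
printf '%s\n' "$after_match"
# prints:
#   gamma
#   TERMINATE
#   delta

# Edge case: the match is on line 1. The range end is searched from
# line 2 onward, no later match exists, so every line is deleted.
first=$(mktemp)
printf '%s\n' TERMINATE gamma delta > "$first"
sed_result=$(sed -e '1,/TERMINATE/d' "$first")

# Alternative (awk, swapped in for illustration): print lines only once a
# match has been seen on an earlier line; this also works for line 1.
awk_result=$(awk 'found { print } /TERMINATE/ { found = 1 }' "$first")
printf 'sed: [%s]\nawk: [%s]\n' "$sed_result" "$awk_result"

rm -f "$sample" "$first"
```

If the pattern can never appear on the first line of your input, the sed form is sufficient.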
If you want the lines before TERMINATE:
sed -e '/TERMINATE/,$d'
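As an illustration, the following sketch applies that command to a small made-up file and compares it with a variant that quits at the first match instead of deleting to the end. The quit-based variant is an assumption of this example rather than something the answer proposes; it simply avoids reading the rest of the input.

```shell
sample=$(mktemp)
printf '%s\n' alpha beta TERMINATE gamma delta > "$sample"

# Delete from the first match through the last line; the rest is printed.
before_match=$(sed -e '/TERMINATE/,$d' "$sample")

# Variant: print each line, but quit as soon as a line matches.
# With -n, q exits without printing the matching line.
before_quit=$(sed -n -e '/TERMINATE/q' -e 'p' "$sample")

printf '%s\n' "$before_match"
# prints:
#   alpha
#   beta

rm -f "$sample"
```

Both commands give the same output on this sample.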
And if you want the lines before and the lines after TERMINATE written to two different files in a single pass:
sed -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file
Both the before and the after file will contain the TERMINATE line, so to process each of them without it you need to use:
head -n -1 before
tail -n +2 after
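Put together, the split and the trimming might look like the sketch below. It runs in a temporary directory with made-up contents. Two details are assumptions of this example: -n is added so sed does not also echo the file to standard output, and sed '$d' stands in for head -n -1, since a negative count for head is a GNU extension.

```shell
# Work in a throwaway directory so the hypothetical files clobber nothing.
origdir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' alpha beta TERMINATE gamma delta > file

# One pass: lines 1..match go to "before", match..EOF go to "after".
sed -n -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file

# Both files contain the TERMINATE line, so trim it from each.
before_only=$(sed -e '$d' before)   # drop the last line
after_only=$(tail -n +2 after)      # drop the first line

printf 'before:\n%s\nafter:\n%s\n' "$before_only" "$after_only"

cd "$origdir" || exit 1
rm -rf "$workdir"
```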
If you do not want to hard-code the filenames in the sed script, you can do this:
before=before.txt
after=after.txt
sed -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" file
But then you have to escape the $ that means "last line", so that the shell does not try to expand a variable named $w (note that the script is now wrapped in double quotes instead of single quotes).
Also note that the newline after each filename in the script is important: it is how sed knows where the filename ends.
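A runnable sketch of the double-quoted form follows. The temporary directory, the sample contents, and the added -n (to suppress normal output) are assumptions of this example.

```shell
workdir=$(mktemp -d)
printf '%s\n' alpha beta TERMINATE gamma delta > "$workdir/file"

before=$workdir/before.txt
after=$workdir/after.txt

# Double quotes let $before and $after expand. \$ keeps the "last line"
# address literal; the newline after $before ends the first filename.
sed -n -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" "$workdir/file"

before_text=$(cat "$before")
after_text=$(cat "$after")
printf '%s\n---\n%s\n' "$before_text" "$after_text"

rm -rf "$workdir"
```

Without the backslash, the shell would replace $w with the value of a variable named w (most likely empty) before sed ever saw the script.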
2016-0530
Sébastien Clément asked: "How would you replace the hardcoded TERMINATE by a variable?"
You would make a variable for the matching text and then proceed the same way as in the previous example:
matchtext=TERMINATE
before=before.txt
after=after.txt
sed -e "1,/$matchtext/w $before
/$matchtext/,\$w $after" file
To use a variable for the matching text with the earlier examples:
## Print the line containing the matching text, till the end of the file:
## (from the matching line to EOF, including the matching line)
matchtext=TERMINATE
sed -n -e "/$matchtext/,\$p"
## Print from the line that follows the line containing the
## matching text, till the end of the file:
## (from AFTER the matching line to EOF, NOT including the matching line)
matchtext=TERMINATE
sed -e "1,/$matchtext/d"
## Print all the lines before the line containing the matching text:
## (from line-1 to BEFORE the matching line, NOT including the matching line)
matchtext=TERMINATE
sed -e "/$matchtext/,\$d"
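One caveat the examples above do not cover: the variable's value is pasted into the sed script as-is, so a value containing / would end the /.../ address early. A possible workaround, sketched here under that assumption, is sed's \%regex% address syntax, which selects another delimiter. The sample contents and names are hypothetical, and other regular-expression metacharacters in the value are still interpreted by sed.

```shell
sample=$(mktemp)
printf '%s\n' alpha 'end/of/header' gamma delta > "$sample"

matchtext='end/of/header'

# "\\%" reaches sed as "\%", which makes % the regex delimiter, so the
# slashes inside $matchtext are treated as ordinary characters.
after_match=$(sed -e "1,\\%$matchtext%d" "$sample")
printf '%s\n' "$after_match"
# prints:
#   gamma
#   delta

rm -f "$sample"
```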
The important points about replacing text with variables in these cases are:
- Variables ($variablename) enclosed in single quotes ['] are not expanded, but variables inside double quotes ["] are. So you have to change single quotes to double quotes wherever the quoted text contains something you want to replace with a variable.
- The sed ranges also contain a $ immediately followed by a letter, such as $p, $d, $w. To the shell these look like variables to expand as well, so you have to escape those $ characters with a backslash [\], as in \$p, \$d, \$w.
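Both points can be checked without running sed at all, by looking at the script string the shell would hand over. The variable names single and double below are made up for this sketch.

```shell
matchtext=TERMINATE

# Single quotes: nothing is expanded, sed would see the literal text.
single='/$matchtext/,$p'

# Double quotes: $matchtext expands, and \$ survives as a literal $.
double="/$matchtext/,\$p"

printf '%s\n' "$single" "$double"
# prints:
#   /$matchtext/,$p
#   /TERMINATE/,$p
```

Only the second string is the script you actually want sed to run.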