How to get the part of a file after the first line that matches a regular expression?


Problem description

I have a file with about 1000 lines. I want the part of the file after the line that matches my grep statement.

That is:

$ cat file | grep 'TERMINATE'     # It is found on line 534

So, I want lines 535 through 1000 of the file for further processing.

How can I do that?

Recommended answer

The following prints from the first line matching TERMINATE through the end of the file:

sed -n -e '/TERMINATE/,$p'

Explained: -n disables sed's default behavior of printing each line after running the script on it, -e introduces the script, /TERMINATE/,$ is an address (line) range that selects from the first line matching the TERMINATE regular expression (as grep would match it) through the end of the file ($), and p is the print command, which prints the current line.
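To see the range in action, here is a minimal sketch that runs the command above on a five-line sample file. The file contents and the names tmpdir and from_match are made up for illustration.

```shell
# Build a small sample file (hypothetical contents, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'TERMINATE' 'gamma' 'delta' > "$tmpdir/sample.txt"

# Print from the first matching line through the end of the file.
from_match=$(sed -n -e '/TERMINATE/,$p' "$tmpdir/sample.txt")
printf '%s\n' "$from_match"

# Remove the scratch directory.
rm -rf "$tmpdir"
```

The output starts at the TERMINATE line itself, followed by gamma and delta.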

This prints from the line that follows the first line matching TERMINATE through the end of the file:
(from AFTER the matching line to EOF, NOT including the matching line)

sed -e '1,/TERMINATE/d'

Explained: 1,/TERMINATE/ is an address (line) range that selects from the first line of the input through the first line matching the TERMINATE regular expression, and d is the delete command, which deletes the current line and skips to the next one. Because sed prints lines by default, it prints the lines after TERMINATE through the end of the input.
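One edge case is worth checking: when the range starts at line 1, sed begins looking for the end of the range on line 2, so a TERMINATE on the very first line does not close it. The sketch below shows this on a made-up file, next to an awk one-liner that does not have the problem. The awk command is a swapped-in alternative, not part of the original answer, and the file and variable names are hypothetical.

```shell
# Sample file whose FIRST line already matches (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' 'TERMINATE' 'one' 'TERMINATE' 'two' > "$tmpdir/edge.txt"

# sed: the range opens on line 1 and its end is searched from line 2 onward,
# so everything through the second TERMINATE is deleted.
sed_out=$(sed -e '1,/TERMINATE/d' "$tmpdir/edge.txt")

# awk: print a line only if a match was seen on an earlier line.
awk_out=$(awk 'found { print } /TERMINATE/ { found = 1 }' "$tmpdir/edge.txt")

printf 'sed:\n%s\nawk:\n%s\n' "$sed_out" "$awk_out"
rm -rf "$tmpdir"
```

If the file cannot start with the pattern, the sed form is fine as written. GNU sed also accepts 0,/TERMINATE/ as the range start for this case, but that form is an extension and not portable.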

If you want the lines before TERMINATE:

sed -e '/TERMINATE/,$d'
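As a possible variation, sed can stop reading at the match instead of deleting the rest of the input, which may help on large files. The sketch below compares both forms on a made-up file. The q-based command is an alternative not taken from the original answer, and all names in it are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'TERMINATE' 'gamma' > "$tmpdir/sample.txt"

# Form from the answer: delete from the first match through the last line.
before_delete=$(sed -e '/TERMINATE/,$d' "$tmpdir/sample.txt")

# Alternative: with -n, quit at the first match (q prints nothing under -n)
# and print every line seen before it.
before_quit=$(sed -n -e '/TERMINATE/q' -e 'p' "$tmpdir/sample.txt")

printf '%s\n' "$before_quit"
rm -rf "$tmpdir"
```

Both commands produce the same two lines here, alpha and beta.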

And if you want the lines before and the lines after TERMINATE written to two different files in a single pass:

sed -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file

The before and after files will both contain the TERMINATE line, so to process each one you need to strip it:

head -n -1 before
tail -n +2 after
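Note that a negative count for head -n is a GNU extension and may not be available on every system. A sketch of a sed-only way to trim the shared TERMINATE line follows. The two input files are created by hand here to stand in for the before and after files, and the variable names are hypothetical.

```shell
tmpdir=$(mktemp -d)

# Stand-ins for the "before" and "after" files written by the sed script.
printf '%s\n' 'alpha' 'beta' 'TERMINATE' > "$tmpdir/before"
printf '%s\n' 'TERMINATE' 'gamma' 'delta' > "$tmpdir/after"

# Drop the last line of "before" and the first line of "after".
before_trimmed=$(sed -e '$d' "$tmpdir/before")
after_trimmed=$(sed -e '1d' "$tmpdir/after")

printf 'before:\n%s\nafter:\n%s\n' "$before_trimmed" "$after_trimmed"
rm -rf "$tmpdir"
```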

If you do not want to hard-code the filenames in the sed script, you can do this:

before=before.txt
after=after.txt
sed -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" file

But then you have to escape the $ that means the last line, so that the shell does not try to expand $w as a variable (note that the script is now in double quotes instead of single quotes).

I forgot to mention that the newline after each filename in the script is important: it is how sed knows where the filename ends.
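If an embedded newline in the script is awkward, each w command can be given its own -e option instead, since the filename then ends where that argument ends. The following is a sketch under that assumption. The -n flag is an addition that keeps sed from also echoing the whole file to standard output, and the sample file and variable names are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'TERMINATE' 'omega' > "$tmpdir/file"

before="$tmpdir/before.txt"
after="$tmpdir/after.txt"

# One -e per command: the w filename runs to the end of its own argument.
sed -n -e "1,/TERMINATE/w $before" -e "/TERMINATE/,\$w $after" "$tmpdir/file"

before_content=$(cat "$before")
after_content=$(cat "$after")
printf 'before:\n%s\nafter:\n%s\n' "$before_content" "$after_content"
rm -rf "$tmpdir"
```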


2016-0530


Sébastien Clément asked: "How would you replace the hardcoded TERMINATE by a variable?"

You would make a variable for the matching text and then proceed the same way as in the previous example:

matchtext=TERMINATE
before=before.txt
after=after.txt
sed -e "1,/$matchtext/w $before
/$matchtext/,\$w $after" file

To use a variable for the matching text with the earlier examples:

## Print the line containing the matching text, till the end of the file:
## (from the matching line to EOF, including the matching line)
matchtext=TERMINATE
sed -n -e "/$matchtext/,\$p"

## Print from the line that follows the line containing the 
## matching text, till the end of the file:
## (from AFTER the matching line to EOF, NOT including the matching line)
matchtext=TERMINATE
sed -e "1,/$matchtext/d"

## Print all the lines before the line containing the matching text:
## (from line-1 to BEFORE the matching line, NOT including the matching line)
matchtext=TERMINATE
sed -e "/$matchtext/,\$d"

The important points about replacing text with variables in these cases are:

  1. Variables ($variablename) enclosed in single quotes ['] won't "expand", but variables inside double quotes ["] will. So you have to change all the single quotes to double quotes if they contain text you want to replace with a variable.
  2. The sed ranges also contain a $ that is immediately followed by a letter, like $p, $d, $w. These also look like variables to be expanded, so you have to escape those $ characters with a backslash [\], like \$p, \$d, \$w.
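The two points can be checked side by side. The sketch below also shows a third option that the answer does not mention: keep the script in single quotes and switch to double quotes only around the variable, so the $ of $p needs no backslash. The sample file and the names escaped and mixed are hypothetical. In both forms the variable's value is still read as a regular expression, so a value containing / or regex metacharacters would need escaping first.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'TERMINATE' 'omega' > "$tmpdir/file"
matchtext=TERMINATE

# Double-quoted script: the shell expands $matchtext, and \$ protects
# the last-line address from expansion.
escaped=$(sed -n -e "/$matchtext/,\$p" "$tmpdir/file")

# Mixed quoting: single quotes for the sed syntax, double quotes only
# around the variable, so $p is left untouched without a backslash.
mixed=$(sed -n -e '/'"$matchtext"'/,$p' "$tmpdir/file")

printf '%s\n' "$mixed"
rm -rf "$tmpdir"
```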
