How to get the part of a file after the first line that matches a regular expression

Question

I have a file with about 1000 lines. I want the part of my file after the line which matches my grep statement.

For example:

cat file | grep 'TERMINATE'     # It is found on line 534

So, I want the file from line 535 to line 1000 for further processing.

How can I do that?

Recommended answer

The following prints from the first line matching TERMINATE through the end of the file, including the matching line:

sed -n -e '/TERMINATE/,$p'

Explanation: -n disables sed's default behavior of printing each line after running its script on it; -e introduces the sed script; /TERMINATE/,$ is an address (line) range that runs from the first line matching the regular expression TERMINATE (matched the way grep would) to the end of the file ($); and p is the print command, which prints the current line.
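
To see the range address in action, here is a minimal sketch on a throwaway five-line sample. The file contents and the variable names `sample` and `from_match` are made up for illustration; they are not part of the original answer.

```shell
# Build a small sample file (hypothetical contents, for illustration only).
sample=$(mktemp)
printf '%s\n' 'alpha' 'beta' 'TERMINATE' 'gamma' 'delta' > "$sample"

# Print from the first line matching TERMINATE through the last line ($).
from_match=$(sed -n -e '/TERMINATE/,$p' "$sample")
printf '%s\n' "$from_match"

rm -f "$sample"
```

On this sample the command should print the TERMINATE line followed by gamma and delta.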

This prints from the line that follows the first line matching TERMINATE through the end of the file (from AFTER the matching line to EOF, NOT including the matching line):

sed -e '1,/TERMINATE/d'

Explanation: 1,/TERMINATE/ is an address (line) range that runs from the first line of the input to the first line matching the regular expression TERMINATE, and d is the delete command, which deletes the current line and skips to the next one. Because sed prints lines by default, the output is every line after TERMINATE through the end of the input.
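
The delete form can be tried the same way. The sketch below uses a made-up sample file and variable names, and also illustrates one caveat that is not in the original answer: sed only starts testing the end pattern of a range on the line after the one that opened it, so a TERMINATE on line 1 does not close the range 1,/TERMINATE/.

```shell
# Hypothetical sample data, for illustration only.
sample=$(mktemp)
printf '%s\n' 'alpha' 'beta' 'TERMINATE' 'gamma' 'delta' > "$sample"

# Delete line 1 through the first TERMINATE line; sed prints the rest by default.
after_match=$(sed -e '1,/TERMINATE/d' "$sample")
printf '%s\n' "$after_match"

# Caveat: the end pattern is only tested from line 2 onward, so a match on
# line 1 does not end the range. With no later match, every line is deleted.
printf '%s\n' 'TERMINATE' 'gamma' 'delta' > "$sample"
first_line_case=$(sed -e '1,/TERMINATE/d' "$sample")
printf '[%s]\n' "$first_line_case"

rm -f "$sample"
```

If the match can occur on the very first line, the inclusive form /TERMINATE/,$p followed by tail -n +2 is one way to sidestep that case.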

If you want the lines before TERMINATE:

sed -e '/TERMINATE/,$d'

And if you want the lines before and after TERMINATE written to two different files in a single pass:

sed -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file

Both the before and after files will contain the TERMINATE line, so to process each one without it you need to use:

head -n -1 before
tail -n +2 after
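
The split-and-trim steps can be sketched end to end as follows. Everything here other than the sed script itself is an assumption for illustration: the sample data, the `workdir` scratch directory, and the added -n (used only to keep sed from also echoing the input to stdout). Because a negative count for head -n is a GNU extension, the sketch swaps in sed '$d', which drops the last line portably.

```shell
# Scratch directory and sample data (hypothetical, for illustration only).
workdir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'TERMINATE' 'gamma' 'delta' > "$workdir/sample.txt"

# Run in a subshell so the relative names "before" and "after" land in workdir.
(
  cd "$workdir" || exit 1
  sed -n -e '1,/TERMINATE/w before
/TERMINATE/,$w after' sample.txt
)

# Drop the shared TERMINATE line from each half.
before_only=$(sed '$d' "$workdir/before")   # portable stand-in for: head -n -1 before
after_only=$(tail -n +2 "$workdir/after")

printf 'before:\n%s\n' "$before_only"
printf 'after:\n%s\n' "$after_only"

rm -rf "$workdir"
```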

If you do not want to hard-code the filenames in the sed script, you can:

before=before.txt
after=after.txt
sed -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" file

But then you have to escape the $ that means "last line", so that the shell does not try to expand $w as a variable (note that the script is now wrapped in double quotes instead of single quotes).

Note that the newline after each filename in the script matters: it is how sed knows where the filename ends.
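
If the embedded newline feels fragile, one alternative (a sketch, not taken from the original answer) is to give each command its own -e option. sed joins the -e fragments with newlines, so each filename still ends where its fragment ends. The sample data, the `workdir` directory, and the added -n are assumptions for illustration.

```shell
# Scratch directory and sample data (hypothetical, for illustration only).
workdir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'TERMINATE' 'gamma' 'delta' > "$workdir/sample.txt"

before=$workdir/before.txt
after=$workdir/after.txt

# One -e per command; \$ keeps the shell from expanding $w inside double quotes.
sed -n -e "1,/TERMINATE/w $before" -e "/TERMINATE/,\$w $after" "$workdir/sample.txt"

# Count the lines written to each file (tr strips padding some wc versions add).
before_lines=$(wc -l < "$before" | tr -d ' ')
after_lines=$(wc -l < "$after" | tr -d ' ')
printf 'before: %s lines, after: %s lines\n' "$before_lines" "$after_lines"

rm -rf "$workdir"
```

Both files include the TERMINATE line, so each should hold three lines for this sample.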

How would you replace the hard-coded TERMINATE with a variable?

You would make a variable for the matching text and then do it the same way as in the previous example:

matchtext=TERMINATE
before=before.txt
after=after.txt
sed -e "1,/$matchtext/w $before
/$matchtext/,\$w $after" file

To use a variable for the matching text with the earlier examples:

## Print the line containing the matching text, till the end of the file:
## (from the matching line to EOF, including the matching line)
matchtext=TERMINATE
sed -n -e "/$matchtext/,\$p"

## Print from the line that follows the line containing the
## matching text, till the end of the file:
## (from AFTER the matching line to EOF, NOT including the matching line)
matchtext=TERMINATE
sed -e "1,/$matchtext/d"

## Print all the lines before the line containing the matching text:
## (from line-1 to BEFORE the matching line, NOT including the matching line)
matchtext=TERMINATE
sed -e "/$matchtext/,\$d"
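
One thing to watch once the pattern lives in a variable: its value is pasted straight into the sed script, so a / inside it would end the address early. A possible workaround, sketched here under the assumption that the text contains no | character, is sed's \cREGEXc address form with | as the delimiter. The sample input and the variable names are hypothetical, and regex metacharacters such as . and * in the value are still interpreted as regex syntax.

```shell
# Hypothetical matching text containing slashes.
matchtext='/var/log'

# \|...| makes | the address delimiter, so the slashes in $matchtext are
# ordinary characters. Inside double quotes the shell leaves \| untouched.
after_match=$(printf '%s\n' 'start' 'path /var/log here' 'tail' |
  sed -e "1,\|$matchtext|d")
printf '%s\n' "$after_match"
```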

The important points about replacing text with variables in these cases are:

  1. Variables ($variablename) enclosed in single quotes ['] won't "expand", but variables inside double quotes ["] will. So you have to change the single quotes to double quotes wherever they contain text you want to replace with a variable.
  2. The sed ranges also contain a $ immediately followed by a letter, such as $p, $d, $w. These look like variables to the shell too, so you have to escape those $ characters with a backslash [\], like this: \$p, \$d, \$w.
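
Both points can be checked by looking at the script text the shell actually hands to sed. This is a minimal sketch; the variable names `single`, `double`, `unescaped`, and `result` are invented for illustration, and `p` is set to an empty string to mimic the usual case where no such shell variable exists.

```shell
matchtext=TERMINATE
p=''   # stands in for "no shell variable named p", the usual situation

# Single quotes: the shell passes the text through untouched.
single='/$matchtext/,$p'
# Double quotes: $matchtext expands, and \$ survives as a literal $.
double="/$matchtext/,\$p"
# Double quotes without the backslash: the shell swallows $p.
unescaped="/$matchtext/,$p"

printf 'single:    %s\n' "$single"
printf 'double:    %s\n' "$double"
printf 'unescaped: %s\n' "$unescaped"

# Only the escaped double-quoted form is the script we actually want.
result=$(printf '%s\n' 'alpha' 'TERMINATE' 'omega' | sed -n -e "$double")
printf '%s\n' "$result"
```

The unescaped form loses both the $ and the p command, which is why sed would reject it or do something other than intended.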
