Grep across multiple lines: find all words between two patterns


Question

I need help scanning text files to find all the words between two patterns. For example, given a .sql file, I need to find all the words between 'from' and 'where'. Grep only scans one line at a time. What is the best Unix tool for this requirement? Do sed or awk have these features? Pointers to any examples are greatly appreciated.

Recommended answer

Sed can do this:

sed -n -e '/from/,/where/ p' file.sql

This prints every line from a line containing from through the next line containing where, inclusive.
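As a quick illustration of the range address, here is a minimal sketch run against a made-up query. The file contents and the variable names are assumptions for the example, not part of the original question.

```shell
# Hypothetical sample query; the contents are an assumption for illustration.
demo_sql=$(mktemp)
cat > "$demo_sql" <<'EOF'
select id, name
from users u
join orders o on o.user_id = u.id
where o.total > 100
order by name;
EOF

# -n suppresses automatic printing; p prints only the lines inside the range.
range_out=$(sed -n -e '/from/,/where/ p' "$demo_sql")
printf '%s\n' "$range_out"
rm -f "$demo_sql"
```

The range works on whole lines, so any text before from on the first line and after where on the last line is printed too. The end of the range is only searched for starting on the line after the one that matched from.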

For input that can include lines containing both from and where, and to print only the text between the two words:

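#!/bin/sed -nf

# Case 1: from and where are on the same line.
/from.*where/ {
    s/.*\(from.*where\).*/\1/p
    d
}
# Case 2: from starts here; keep appending lines until a where shows up.
/from/ {
    : next
    N
    /where/ {
        # [^\n] (a GNU sed extension) limits the trimming to the first and last line.
        s/^[^\n]*\(from.*where\)[^\n]*/\1/p
        d
    }
    $! b next
}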

This version (written as a sed script) is slightly more complex, so I'll try to explain the details.

The first command is executed on a line that contains both a from and a where, in that order. If a line matches that pattern, two commands are executed. We use the s (substitute) command to extract only the part between from and where (including the from and the where). The p suffix on that command prints the result. The d (delete) command then clears the pattern space (the working buffer), loads the next line and restarts the script.
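The single-line case can be tried in isolation. This is a minimal sketch, assuming a made-up one-line query and using basic regular expressions, where groups are written \( \) and the back-reference is \1.

```shell
# Hypothetical one-line query; only the from ... where part should survive.
single_out=$(printf '%s\n' 'select a from t where a > 1;' |
  sed -n -e '/from.*where/ s/.*\(from.*where\).*/\1/p')
printf '%s\n' "$single_out"
# prints: from t where
```

The leading and trailing .* consume everything outside the group, so the substitution replaces the whole line with just the captured text.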

The second command executes a series of commands (grouped by the braces) when a line containing from is found. Basically, the commands form a loop that keeps appending lines from the input to the pattern space until a line with a where is found or until we reach the last line.

The : "command" creates a label, a marker in the script that allows us to "jump" back when we want to. The N command reads a line from the input and appends it to the pattern space (separating the lines with a newline character).

When a where is found, we can print the contents of the pattern space, but first we have to clean it up with the substitute command. It is analogous to the one used previously, but the leading and trailing .* are now replaced with [^\n]*, which tells sed to match only non-newline characters, so the trimming is confined to the text before from on the first line and after where on the last line. The d command then clears the pattern space and restarts the script on the next line.
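To see the multi-line case end to end, the script can be saved to a file and run with sed -n -f. This is a sketch under a few assumptions: GNU sed (for \n inside a bracket expression), temporary files created with mktemp, and a made-up query whose from and where sit on different lines.

```shell
# The script from the answer, written out with its backslashes intact.
between_sed=$(mktemp)
cat > "$between_sed" <<'EOF'
/from.*where/ {
    s/.*\(from.*where\).*/\1/p
    d
}
/from/ {
    : next
    N
    /where/ {
        s/^[^\n]*\(from.*where\)[^\n]*/\1/p
        d
    }
    $! b next
}
EOF

# Hypothetical query: from is on line 2, where is on line 4.
multi_sql=$(mktemp)
cat > "$multi_sql" <<'EOF'
select a,
       b from t1
  join t2 on t1.id = t2.id
where a > 1 order by b;
EOF

multi_out=$(sed -n -f "$between_sed" "$multi_sql")
printf '%s\n' "$multi_out"
rm -f "$between_sed" "$multi_sql"
```

With GNU sed this prints from t1, the join line unchanged, and where: the text before from on the first line and after where on the last line is trimmed, while the lines in between are kept whole.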

The b command jumps to a label, in our case the label next. However, the $! address says it should not be executed on the last line, which lets us leave the loop. When the loop ends this way, no matching where was found, so you probably don't want to print anything.
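The exit path of the loop can be checked with a from that never gets its where. This is a minimal sketch with made-up input; the loop is reduced to its skeleton and simply prints the pattern space if a where ever appears.

```shell
# Hypothetical input: a from with no where anywhere after it.
loop_out=$(printf '%s\n' 'select a from t' 'join u' 'order by a' |
  sed -n -e '
/from/ {
    : next
    N
    /where/ { p; d; }
    $! b next
}')
printf 'output: [%s]\n' "$loop_out"
# prints: output: []
```

On the last line the branch is skipped, the block ends, and because of -n the accumulated lines are discarded rather than printed.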

Note, however, that this has some drawbacks. The following cases won't be handled as expected:

from ... where ... from

from ... from
where

from
where ... where

from
from
where
where

Handling these cases requires more code.
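One of these cases is easy to reproduce in isolation. The sketch below assumes GNU sed and a made-up two-line input where the second line contains two where words; it joins the lines with N and applies the multi-line substitution directly.

```shell
# Hypothetical input: one from, then a line with two where words.
greedy_out=$(printf 'from a\nwhere b where c\n' |
  sed -n -e 'N' -e 's/^[^\n]*\(from.*where\)[^\n]*/\1/p')
printf '%s\n' "$greedy_out"
```

The output is from a followed by where b where: the .* inside the group is greedy, so the match runs to the last where rather than stopping at the first one, which is presumably not what is wanted here.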

Hope this helps =)
