sed, awk, perl or lex: find strings by prefix+regex, ignoring rest of input

Views: 105 Published: 2016/7/29 11:19:49 Tags: regex perl sed awk lex

Question

I need to find strings with a certain prefix, followed by a regexp, in a bunch of files, but ignore the rest of the input (including whatever precedes the prefix on the line and whatever follows the end of the regexp match).

What's the best tool for the job? grep finds complete lines; sed is usually used just for editing and select-and-replace; awk? perl?

I also thought about lex, but am I really after a compiler compiler?!

Edit: the input is several thousand HTML files, the prefix + regular expression would be https://([-.0-9A-Za-z]+\.[A-Za-z]{2,}) (of which I want $1), and the rest of the input is ignored.

Recommended answer

If you won't have more than one occurrence of the pattern on a single line, I'd probably use sed:

sed -n -e 's%.*https://\([-.0-9A-Za-z]\{1,\}\.[A-Za-z]\{2,\}\).*%\1%p'

Given the data file:

Nothing here
Before https://example.com after
https://example.com and after
Before you get to https://www.example.com
And double your https://example.com for fun and happiness https://www.example.com in triplicate https://a.bb
and nothing here

The sed script produces one entry per matching line; when a line holds more than one match, it shows the last one, because the leading .* is greedy:

example.com
example.com
www.example.com
a.bb
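
Where one match per line is not enough and your grep supports the -o option (GNU and BSD grep do; it is not in POSIX), the extraction can be split into two steps: grep prints each match on its own line, then sed strips the fixed prefix. This is only a sketch under that assumption; the function name hosts_with_grep is hypothetical.

```shell
# hosts_with_grep: hypothetical helper; relies on the non-POSIX -o option.
# Reads the named files, or standard input when no file is given.
hosts_with_grep() {
    # -o prints each match on its own line, -h suppresses file names,
    # -E enables extended regular expressions.
    grep -ohE 'https://[-.0-9A-Za-z]+\.[A-Za-z]{2,}' "$@" |
        sed 's%^https://%%'   # strip the fixed prefix, leaving the host
}
```

It would be called as hosts_with_grep data, or with a list of HTML files.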

A Perl script can be used for multiple entries per line:

$ perl -nle 'print $1 while (m%https://([-.0-9A-Za-z]+\.[A-Za-z]{2,})%g);' data
example.com
example.com
www.example.com
example.com
www.example.com
a.bb
$
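
Since the question also asks about awk, here is a rough awk equivalent of the Perl loop, wrapped in a shell function. It is a sketch, not part of the original answer: the name extract_hosts is hypothetical, and [A-Za-z][A-Za-z]+ stands in for [A-Za-z]{2,} on the assumption that some awk implementations do not accept interval expressions.

```shell
# extract_hosts: hypothetical helper that prints every host following
# "https://" in the named files (or standard input), one per line.
extract_hosts() {
    awk '
    {
        line = $0
        # match() sets RSTART and RLENGTH for the leftmost match.
        while (match(line, /https:\/\/[-.0-9A-Za-z]+\.[A-Za-z][A-Za-z]+/)) {
            # Skip the 8 characters of "https://" and print the host.
            print substr(line, RSTART + 8, RLENGTH - 8)
            # Keep scanning the text after this match.
            line = substr(line, RSTART + RLENGTH)
        }
    }' "$@"
}
```

Called as extract_hosts data, it scans each line repeatedly, so several hosts on one line are all reported.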
