How to use sed to match any URL?
Question
I'm trying to replace all URLs in a large collection of documents with a single token, but the regular expression I'm using does not seem to work:
s/www\.[a-z|0-9]*[a-z]*/URLTOKEN/g
This matches www.example.com or www.example.com.co. But what if more characters or words follow, e.g., www.foo.bar?q=lol or www.regexr.com/index.html?q=bar? I'd like to match ANY combination of characters after the first part of the URL has been validated, i.e., www.example.com.co_(those characters go here)_, up to but not including the first space. What kind of regexp would do that in sed?
Recommended answer
sed 's/www\.[^ ]*/URLTOKEN/g' file
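To see how this behaves, here is a minimal sketch that wraps the substitution in a shell function and feeds it one sample line. The function name `replace_urls` and the sample text are made up for illustration; only the sed expression comes from the answer.

```shell
# Hypothetical helper: reads text on stdin and replaces every
# "www."-prefixed URL with the literal token URLTOKEN.
replace_urls() {
  # www\.  matches the literal prefix "www." (the dot is escaped)
  # [^ ]*  matches any run of characters that are not a space,
  #        so the match ends just before the next space or at end of line
  sed 's/www\.[^ ]*/URLTOKEN/g'
}

# Sample input invented for this example
printf '%s\n' 'see www.foo.bar?q=lol and www.regexr.com/index.html?q=bar here' | replace_urls
# prints: see URLTOKEN and URLTOKEN here
```

Note that `[^ ]*` stops only at a space character, so a tab or trailing punctuation directly after the URL is swallowed into the match. If any whitespace should end the URL, `[^[:space:]]*` is a possible variant.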