Regular expression for recognizing URLs
Problem description
I want to create a regex that extracts all links from an input string. The regex should recognize the following URL formats:
- http(s)://www.webpage.com
- http(s)://webpage.com
- www.webpage.com
and also more complicated URLs such as:
- http://www.google.pl/#sclient=psy&hl=pl&site=&source=hp&q=regex+url&pbx=1&oq=regex+url&aq=f&aqi=g1&aql=&gs_sm=e&gs_upl=1582l3020l0l3199l9l6l0l0l0l0l255l1104l0.2.3l5l0&bav=on.2,or.r_gc.r_pw.&fp=30a1604d4180f481&biw=1680&bih=935
I currently have the following one:
((www\.|https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)
but it does not recognize the pattern www.webpage.com. Can someone please help me create an appropriate regex?
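The pattern misses www.webpage.com because the alternation `(www\.|https?|...)` is always followed by a literal `:` and then `//` or `\\`, so `www.` can only match when it is followed by `://`. Below is a minimal sketch of one way around that. The `Relaxed` pattern and the `UrlPatternSketch` class are illustrative assumptions, not part of the original post, and the pattern is deliberately loose: it will, for instance, swallow trailing punctuation such as a sentence-ending period.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlPatternSketch
{
    // The pattern from the question, unchanged.
    public const string Original =
        @"((www\.|https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)";

    // Hypothetical alternative: either a scheme followed by "://",
    // or a bare "www." prefix, then everything up to whitespace, <, > or ".
    public const string Relaxed =
        @"(?:(?:https?|ftp)://|www\.)[^\s<>""]+";

    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(Matches(Original, "www.webpage.com"));        // False: no ':' after "www."
        Console.WriteLine(Matches(Original, "http://webpage.com"));     // True
        Console.WriteLine(Matches(Relaxed, "www.webpage.com"));         // True
        Console.WriteLine(Matches(Relaxed, "https://www.webpage.com")); // True
    }
}
```

The remaining schemes from the question (gopher, telnet, and so on) could be added to the first alternative in the same way.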
EDIT: It should find each link and also place a hyperlink at the appropriate index in the text, like this:
private readonly Regex RE_URL = new Regex(@"((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)", RegexOptions.Multiline);

foreach (Match match in (RE_URL.Matches(new_text)))
{
    // Copy raw string from the last position up to the match
    if (match.Index != last_pos)
    {
        var raw_text = new_text.Substring(last_pos, match.Index - last_pos);
        text_block.Inlines.Add(new Run(raw_text));
    }
    // Create a hyperlink for the match
    var link = new Hyperlink(new Run(match.Value))
    {
        NavigateUri = new Uri(match.Value)
    };
    link.Click += OnUrlClick;
    text_block.Inlines.Add(link);
    // Update the last matched position
    last_pos = match.Index + match.Length;
}
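Two things are worth watching once scheme-less links such as www.webpage.com start matching. `new Uri("www.webpage.com")` throws a `UriFormatException`, because that constructor expects an absolute URI, and any text after the last match also has to be appended unless that happens elsewhere in the code not shown here. The sketch below covers both without any WPF dependency, so it can be run on its own. The `LinkSegmenter` and `Segment` names, the pattern, and the choice of `http://` as the default scheme are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkSegmenter
{
    // Hypothetical pattern used only for this sketch.
    private static readonly Regex UrlRegex =
        new Regex(@"(?:(?:https?|ftp)://|www\.)[^\s<>""]+", RegexOptions.IgnoreCase);

    // One piece of the input: plain text (Target == null) or a link.
    public sealed class Segment
    {
        public string Text;
        public Uri Target;
    }

    public static List<Segment> Split(string text)
    {
        var segments = new List<Segment>();
        int lastPos = 0;

        foreach (Match match in UrlRegex.Matches(text))
        {
            // Plain text between the previous match and this one.
            if (match.Index != lastPos)
                segments.Add(new Segment { Text = text.Substring(lastPos, match.Index - lastPos) });

            // Prefix scheme-less matches so that Uri can parse them as absolute.
            string candidate = match.Value.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
                ? "http://" + match.Value
                : match.Value;

            // Fall back to plain text if the match still is not a valid URI.
            Uri uri;
            segments.Add(Uri.TryCreate(candidate, UriKind.Absolute, out uri)
                ? new Segment { Text = match.Value, Target = uri }
                : new Segment { Text = match.Value });

            lastPos = match.Index + match.Length;
        }

        // Trailing text after the last match.
        if (lastPos < text.Length)
            segments.Add(new Segment { Text = text.Substring(lastPos) });

        return segments;
    }

    public static void Main()
    {
        foreach (var s in Split("See www.webpage.com and http://webpage.com now"))
            Console.WriteLine((s.Target == null ? "text: " : "link: ") + s.Text);
    }
}
```

In the WPF loop above, each text segment would become a `Run`, and each link segment a `Hyperlink` whose `NavigateUri` is the segment's `Target`.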
I've just written up a blog post on recognising URLs in the most commonly used formats, such as:
www.google.com
http://www.google.com
mailto:somebody@google.com
somebody@google.com
www.url-with-querystring.com/?url=has-querystring
The regular expression used is:
/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/
However, I would recommend you go to http://blog.mattheworiordan.com/post/13174566389/url-regular-expression-for-links-with-or-without-the to see a complete working example along with an explanation of the regular expression, in case you need to extend or tweak it.
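For readers on .NET, as in the question, the expression can be tried out roughly as follows. This is only a sketch: the `AnswerRegexDemo` class and `FirstMatch` helper are made-up names, and the pattern is the one quoted above with the surrounding slashes dropped, on the assumption that the rest of the syntax carries over to `System.Text.RegularExpressions` unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnswerRegexDemo
{
    // The expression from the answer, without the enclosing slashes.
    public static readonly Regex Url = new Regex(
        @"((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)");

    // Returns the first match in the input, or null when nothing matches.
    public static string FirstMatch(string input)
    {
        Match m = Url.Match(input);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // The sample formats listed in the answer.
        string[] samples =
        {
            "www.google.com",
            "http://www.google.com",
            "mailto:somebody@google.com",
            "somebody@google.com",
            "www.url-with-querystring.com/?url=has-querystring"
        };

        foreach (string s in samples)
            Console.WriteLine("{0} -> {1}", s, FirstMatch(s));
    }
}
```

Note that the host part `[A-Za-z0-9.-]+` accepts dots, so a URL at the end of a sentence will be matched together with the trailing period, and the unescaped dot in `www.` matches any character. Both may need tightening depending on the input.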