C#: find and replace URLs in a string with regex

Question

I want to replace URLs such as www.google.com or http://www.google.com with a clickable link to www.google.com. I have this code for it:

str = Regex.Replace(str,
                @"((http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)",
                "<a target='_blank' href='$1'>$1</a>");

It works with http://www.google.com, but not with www.google.com or subdomain.google.com. Which regex matches all of these URL forms? Also, when the link is long, the whole URL is repeated as the link text, for example

http://www.google.com/search/asdadad/sdsdsd/sadasdx-sadasd-weqeqwe-zxcxzc.com

I want it to be written as

<a href="http://www.google.com/search/asdadad/sdsdsd/sadasdx-sadasd-weqeqwe-zxcxzc.com">google.com/asdas... </a>

What is the best way to do this? I am new to regex.

Answer

This will also catch www.test.com:

(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
 ↑---------------------↑↑

Just wrap the part that is optional (the scheme) in a group and append a question mark.
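To see the effect of the optional scheme group, here is a minimal sketch based on the pattern above. The sample strings are my own, and the top-level statement style assumes C# 9 or later; in older projects the same lines go inside a Main method.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from this answer: the scheme group is optional because of the "?" after it.
string pattern = @"(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)";

// Sample inputs (assumed for illustration): with scheme, without scheme, subdomain.
string[] samples = { "http://www.google.com", "www.test.com", "subdomain.google.com" };

foreach (string sample in samples)
{
    Match m = Regex.Match(sample, pattern);
    Console.WriteLine($"{sample} -> matched: {m.Success}, value: {m.Value}");
}
```

Be aware of the trade-off: once the scheme is optional, any dotted word such as readme.txt also matches, so you may want to filter the matches further depending on your input.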


The first capture group in this regex (groups are delimited by "(" and ")") is the whole URL. So you could do the replacement like this:

// requires: using System.Text.RegularExpressions;
Regex rgxUrls = new Regex(pattern);
// $1 inserts the first capture group, i.e. the whole URL
string result = rgxUrls.Replace(yourText, "<a href=\"$1\"> space for custom text </a>");

Where I've used $1 you can also use $2 to $5. The breakdown further down shows which group captures which part of the URL.
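For the shortened link text asked about in the question, a replacement string alone is not enough, but Regex.Replace also accepts a MatchEvaluator callback. The following is only a sketch under my own assumptions: the helper name Linkify, the 20-character limit, and adding http:// when the scheme is missing are all hypothetical choices, not something the question specifies.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

string pattern = @"(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)";
const int maxTextLength = 20; // hypothetical limit for the visible link text

// Hypothetical helper: wraps every match in an <a> tag with shortened text.
string Linkify(string text) =>
    Regex.Replace(text, pattern, m =>
    {
        string url = m.Value;
        bool hasScheme = m.Groups[2].Success; // group 2 is the optional "http://" part

        // Assumption: links without a scheme should point to http://
        string href = hasScheme ? url : "http://" + url;

        // Visible text: drop the scheme, then cut off after maxTextLength characters.
        string display = hasScheme ? url.Substring(m.Groups[2].Length) : url;
        if (display.Length > maxTextLength)
            display = display.Substring(0, maxTextLength) + "...";

        // HTML-encode both values so characters like & stay valid inside the markup.
        return $"<a target='_blank' href='{WebUtility.HtmlEncode(href)}'>{WebUtility.HtmlEncode(display)}</a>";
    });

Console.WriteLine(Linkify("see www.google.com and http://www.google.com/search/asdadad/sdsdsd/sadasdx-sadasd-weqeqwe-zxcxzc.com"));
```

The callback receives each Match, so the same groups $1 to $5 are available through m.Groups[1] to m.Groups[5].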

As asked in the comments, here is how group capturing works:

Text: "this is your text to search"  
Pattern: "text to"

Match[0] is always the whole matched text, here "text to". Every further group, such as Match[1] or Match[2], has to be defined with "(" and ")".

Text: "this is your text to search"  
Pattern: "text (to)"  
Match[0]: "text to"  
Match[1]: "to"  


Pattern: "text (t(o))"  
Match[0]: "text to"  
Match[1]: "to"  
Match[2]: "o"  

Capturing with "()" works from the outside to the inside: groups are numbered in the order of their opening parentheses.
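The nested example can be checked directly in C#. This is a small sketch in which Match[n] from the notation above corresponds to match.Groups[n]; it assumes C# 9 or later for the top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Same text and pattern as in the example above.
Match match = Regex.Match("this is your text to search", "text (t(o))");

// Groups[0] is the whole match; Groups[1] and Groups[2] are the parenthesized parts.
for (int i = 0; i < match.Groups.Count; i++)
    Console.WriteLine($"Groups[{i}]: \"{match.Groups[i].Value}\"");
```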

$1
(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
↑------------------------------------------------------------------------------------------↑

$2 (http://)
(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
 ↑---------------------↑

$3 (http)
(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
  ↑--------------↑

$4 (.com)
(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
                                 ↑----------↑

$5 (/appendedSubdirectory/anotherOne)
(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
                                              ↑------------------------------------------↑
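The breakdown above can be reproduced by printing the groups for one URL. This is a sketch with a sample URL of my own (www.test.com plus the path from the $5 example), again assuming C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = @"(((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)";

// Sample URL assumed for illustration.
Match m = Regex.Match("http://www.test.com/appendedSubdirectory/anotherOne", pattern);

// Groups 1 to 5 correspond to $1 to $5 in a replacement string.
for (int i = 1; i <= 5; i++)
    Console.WriteLine($"${i}: {m.Groups[i].Value}");

// Group 4 sits inside a "+" repetition, so its Value is only the last
// repetition (".com"); every repetition is kept in the Captures collection.
foreach (Capture c in m.Groups[4].Captures)
    Console.WriteLine($"$4 capture: {c.Value}");
```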

I can't explain everything about regex here. This question looks solved to me. If you have deeper questions about regex, start a new question and show what you have tried so far.
