Regex link matching on raw text


Problem description

Does anyone have an idea how to match the links in a string like this:

"http://www.youtube.com/Vid56xtghhttp://vimeo.com/channel/id=14498http://rapigator.net/file/345gtrG/mike_goes_mad.avi"

Basically, this string contains 3 links merged together. I need to split it into 3 distinct links. Is there any way to do that with a regular expression?

Thanks!

What I have tried:

I tried playing with regular expressions, but without any luck. I am not sure how to make the match stop when another "http(s)?://(www\.)?" block is reached...

Solution

You can try this:

http:(?:(?!http:).)*


It matches from http: up to, but not including, the next occurrence of http:.
For your sample string, the result is 3 separate matches.
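
The pattern above can be tried out with a short Python sketch. This is only an illustration, assuming Python's re module; the names LINK_PATTERN and split_links are made up for this example and are not part of the original answer. The negative lookahead (?!http:) is checked before every character, so each match stops right where the next "http:" begins.

```python
import re

# Pattern from the answer: "http:" followed by any characters that do not
# start another "http:" (the negative lookahead is tested before each one).
LINK_PATTERN = re.compile(r"http:(?:(?!http:).)*")


def split_links(text):
    """Return every non-overlapping match of LINK_PATTERN in text."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return LINK_PATTERN.findall(text)


# Sample string from the question: three links merged together.
sample = (
    "http://www.youtube.com/Vid56xtgh"
    "http://vimeo.com/channel/id=14498"
    "http://rapigator.net/file/345gtrG/mike_goes_mad.avi"
)

for link in split_links(sample):
    print(link)
```

Note that "." does not match a newline by default, so a match would also end at a line break unless re.DOTALL is passed.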

[UPDATE]
I forgot the https variant:

http(?:s)?:(?:(?!http(?:s)?:).)*
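
As a rough check of the https variant, here is a minimal Python sketch, again assuming the re module. The mixed sample string with example.com hosts is invented for illustration and does not come from the question, and the names ANY_SCHEME_PATTERN and split_links_any_scheme are hypothetical.

```python
import re

# Pattern from the update: the optional "s" is allowed both at the start of
# a match and inside the lookahead that ends the previous match.
ANY_SCHEME_PATTERN = re.compile(r"http(?:s)?:(?:(?!http(?:s)?:).)*")


def split_links_any_scheme(text):
    """Return every http: or https: link found in text."""
    return ANY_SCHEME_PATTERN.findall(text)


# Hypothetical sample mixing both schemes (not from the original question).
mixed = "https://example.com/ahttp://example.org/bhttps://example.net/c"

for link in split_links_any_scheme(mixed):
    print(link)
```

The group (?:s)? behaves the same as a plain s?, so the pattern could also be written as https?:(?:(?!https?:).)* if a shorter form is preferred.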


See solution 1 for a working solution.

This site can help you check whether a regex is doing what you expect.
Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.

I guess you will have to read the regex documentation to understand advanced usage.
perlre - perldoc.perl.org

By the way, how on earth did you end up with something as silly as 3 links merged into a single string?

