Using regex to extract URLs from plain text with Perl

Views: 95 Published: 2020/5/25 18:57:13 Tags: regex perl url


Problem description

How can I use Perl regexps to extract all URLs of a specific domain (with possibly variable subdomains) with a specific extension from plain text? I have tried:

my $stuff = 'omg http://fail-o-tron.com/bleh omg omg omg omg omg http://homepage.com/woot.gif dfgdfg http://shomepage.com/woot.gif aaa';
while ($stuff =~ m/(http\:\/\/.*?homepage.com\/.*?\.gif)/gmsi)
{
    print $1."\n";
}

It fails horribly and gives me:

http://fail-o-tron.com/bleh omg omg omg omg omg http://homepage.com/woot.gif
http://shomepage.com/woot.gif

I thought that wouldn't happen because I am using .*?, which ought to be non-greedy and give me the smallest match. Can anyone tell me what I am doing wrong? (I don't want some uber-complex, canned regexp to validate URLs; I want to know what I am doing wrong so I can learn from it.)
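
For context on the behaviour described above: a non-greedy quantifier only keeps the match short from its right-hand end. The engine still begins at the leftmost position where any match is possible, and with the /s flag the dot also matches spaces, so the first http:// in the text is free to stretch across several words until it reaches homepage.com. The following is a minimal sketch of one possible way around that, not a definitive answer. It assumes that only homepage.com and its subdomains are wanted (so shomepage.com is excluded) and that a URL never contains whitespace. The helper name extract_gif_urls is hypothetical and does not come from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns every .gif URL
# on homepage.com, or on one of its subdomains, found in $text.
sub extract_gif_urls {
    my ($text) = @_;
    my @urls = $text =~ m{
        (                       # capture the whole URL
            http://
            (?:[a-z0-9-]+\.)*   # optional subdomain labels, each followed by a dot
            homepage\.com/      # the literal domain, with its dot escaped
            \S*?                # path: non-whitespace only, so it cannot span words
            \.gif
        )
    }gix;
    return @urls;
}

my $stuff = 'omg http://fail-o-tron.com/bleh omg omg omg omg omg http://homepage.com/woot.gif dfgdfg http://shomepage.com/woot.gif aaa';
print "$_\n" for extract_gif_urls($stuff);   # prints http://homepage.com/woot.gif
```

The change that matters is replacing each .*? with a pattern that cannot cross a word boundary: the host part is limited to dot-separated labels and the path to \S. The lazy quantifier can then only grow within a single URL, so the match that starts at fail-o-tron.com is rejected instead of being extended to the next address.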
