How can I extract links from an HTML file with Perl?
Problem description
I have some input with a link and I want to open that link. For instance, I have an HTML file and want to find all links in the file and open their contents in an Excel spreadsheet.
Recommended answer
It sounds like you want the linktractor script from my HTML::SimpleLinkExtor module.
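For a local file, the module can also be called directly. The following is a minimal sketch, assuming the `parse_file`, `a`, and `links` methods behave as described in the module's documentation; the helper name `links_from_file` and the file name `page.html` are placeholders, not part of the module.

```perl
use strict;
use warnings;
use HTML::SimpleLinkExtor;

# Return the links found in the <a> tags of a local HTML file.
# links_from_file is a hypothetical helper name used for illustration.
sub links_from_file {
    my ($file) = @_;
    my $extor = HTML::SimpleLinkExtor->new;
    $extor->parse_file($file);
    return $extor->a;    # $extor->links would also include img, link, etc.
}

# Hypothetical usage: perl extract.pl page.html
if (@ARGV) {
    print "$_\n" for links_from_file( $ARGV[0] );
}
```

Relative links come back as written in the file unless a base URL is passed to the constructor.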
You might also be interested in my webreaper script. I wrote that a long, long time ago to do something close to this same task. I don't really recommend it because other tools are much better now, but you can at least look at the code.
Mojo::UserAgent is quite nice for this, too:
use Mojo::UserAgent;

# Fetch the URL given on the command line and print the href of every <a> tag
print Mojo::UserAgent
    ->new
    ->get( $ARGV[0] )
    ->res
    ->dom->find( "a" )
    ->map( attr => "href" )
    ->join( "\n" );
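The snippet above fetches a URL, while the question starts from an HTML file and mentions Excel. One possible adaptation, sketched here under assumptions, reads a local file with Mojo::File, parses it with Mojo::DOM, and prints link text and href as CSV, which Excel can open. The helper names `link_rows` and `csv_field` and the file names are hypothetical, and character-encoding handling is left out for brevity.

```perl
use strict;
use warnings;
use Mojo::DOM;
use Mojo::File qw(path);

# Collect [text, href] pairs for every <a> that has an href attribute.
sub link_rows {
    my ($html) = @_;
    return Mojo::DOM->new($html)
        ->find('a[href]')
        ->map( sub { [ $_->all_text, $_->attr('href') ] } )
        ->each;
}

# Quote one CSV field: wrap it in double quotes and double any embedded quotes.
sub csv_field {
    my ($value) = @_;
    $value =~ s/"/""/g;
    return qq("$value");
}

# Hypothetical usage: perl links2csv.pl page.html > links.csv
if (@ARGV) {
    my $html = path( $ARGV[0] )->slurp;
    print join( ',', map { csv_field($_) } @$_ ), "\n" for link_rows($html);
}
```

Selecting `a[href]` rather than `a` skips anchors that have no href, so no undefined values reach the output.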