How can I make an HTTP GET request from Perl?

Question

I'm trying to write my first Perl program. If you think Perl is a bad language for the task at hand, tell me which language would solve it better.

The program tests connectivity between a given machine and a remote Apache server. First the program requests the directory listing from the Apache server, then it parses the list and downloads all files one by one. If there is a problem with a file (the connection resets before reaching the specified Content-Length), this should be logged and the next file should be retrieved. There is no need to save the files or even check their integrity; I only need to log the time each download takes and every case where the connection resets.

To retrieve the list of links from the Apache-generated directory index, I plan to use a regexp similar to

/href=\"([^\"]+)\"/

Admittedly, the regexp is not debugged yet.

What is the "reference" way to make an HTTP request from Perl? I googled and found examples using many different libraries, some of them commercial. I need something that can detect disconnections (a timeout or a TCP reset) and handle them.

Another question: how do I store everything captured by my regexp during a global search as a list of strings, with minimal coding effort?

Recommended answer

As far as the whole problem description goes, I would use WWW::Mechanize. Mechanize is a subclass of LWP::UserAgent that adds stateful behavior and HTML parsing. With mech, you can just do $mech->get($url_of_index_page), and then use $mech->find_all_links(criteria) to select the links to follow.
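A minimal sketch of that approach might look like the following. The names `probe_directory`, `is_file_link` and `extract_hrefs` are hypothetical helpers invented for illustration, and the 30-second default timeout is an arbitrary choice. The sketch assumes that a transfer cut off mid-body shows up either as an `X-Died` header set by LWP or as fewer body bytes than the `Content-Length` header announced; that behaviour is worth verifying against your LWP version, and the byte comparison cannot catch anything when the server sends no `Content-Length` (for example with chunked responses). `extract_hrefs` touches on the second question: a `/g` match in list context returns every capture.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: true for links that look like files in an Apache index.
# Skips the column-sort links ("?C=N;O=D") and directories (trailing slash).
sub is_file_link {
    my ($url) = @_;
    return $url !~ /^\?/ && $url !~ m{/$};
}

# Regexp alternative: with /g in list context, a match returns every capture.
sub extract_hrefs {
    my ($html) = @_;
    my @hrefs = $html =~ /href="([^"]+)"/g;
    return @hrefs;
}

# Fetch the index page, then request each linked file and time the download.
# Returns a list of hashrefs: { url, seconds, problem } ('' means no problem).
sub probe_directory {
    my ($index_url, $timeout) = @_;

    # Loaded at call time so the helpers above work without the module installed.
    require WWW::Mechanize;

    # autocheck => 0 makes failed requests return normally instead of dying.
    my $mech = WWW::Mechanize->new(
        autocheck => 0,
        timeout   => defined $timeout ? $timeout : 30,
    );

    $mech->get($index_url);
    die "Cannot fetch index $index_url: ", $mech->res->status_line, "\n"
        unless $mech->success;

    # Resolve absolute URLs now, while the index is still the current page.
    my @urls = map  { $_->url_abs }
               grep { is_file_link($_->url) } $mech->find_all_links();

    my @results;
    for my $url (@urls) {
        my $start = time;
        my $res   = $mech->get($url);
        my $secs  = time - $start;

        my $expected = $res->content_length;    # undef if the header is absent
        my $received = length $res->content;    # raw bytes as transferred

        my $problem = '';
        if (!$res->is_success) {
            $problem = $res->status_line;       # includes connect timeouts
        }
        elsif (defined $res->header('X-Died')) {
            $problem = $res->header('X-Died');  # assumed: read died mid-body
        }
        elsif (defined $expected && $received != $expected) {
            $problem = "short read: $received of $expected bytes";
        }

        push @results, { url => "$url", seconds => $secs, problem => $problem };
    }
    return @results;
}

# Example call (placeholder URL):
# for my $r (probe_directory('http://example.com/files/', 10)) {
#     printf "%s %.3fs %s\n", $r->{url}, $r->{seconds}, $r->{problem} || 'ok';
# }
```

Each file body is held in memory for the duration of its request, which keeps the sketch short but may not suit very large files.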
