Getting directory listing over HTTP


Question

There is a directory that is being served over the net which I'm interested in monitoring. Its contents are various versions of software that I'm using, and I'd like to write a script that I could run which checks what's there and downloads anything that is newer than what I've already got.

Is there a way, say with wget or something, to get a directory listing? I've tried using wget on the directory, which gives me HTML. To avoid having to parse the HTML document, is there a way of retrieving a simple listing like ls would give?

Answer

I've just found a way to do it:

$ wget --spider -r --no-parent http://some.served.dir.ca/

It's quite verbose, so you need to pipe through grep a couple of times depending on what you're after, but the information is all there. It looks like it prints to stderr, so append 2>&1 to let grep at it. I grepped for "\.tar\.gz" to find all of the tarballs the site had to offer.
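To illustrate the grep stage, here is a minimal sketch run against a few invented log lines (the exact format of wget's spider output varies by version, so the sample lines below are hypothetical):

```shell
# hypothetical sample of spider-log lines; real output depends on the wget version
log='--2024-01-01--  http://some.served.dir.ca/foo-1.0.tar.gz
--2024-01-01--  http://some.served.dir.ca/index.html
--2024-01-01--  http://some.served.dir.ca/foo-1.1.tar.gz'

# pull out just the tarball URLs, deduplicated and sorted
printf '%s\n' "$log" | grep -o 'http://[^ ]*\.tar\.gz' | sort -u
```

In practice you would replace the `printf` with the real pipeline, i.e. `wget --spider -r --no-parent http://some.served.dir.ca/ 2>&1 | grep -o '...'`.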

Note that wget writes temporary files in the working directory, and doesn't clean up its temporary directories. If this is a problem, you can change to a temporary directory:

$ (cd /tmp && wget --spider -r --no-parent http://some.served.dir.ca/)
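As an aside, for the "download anything newer" part of the question, wget's own timestamping (`-N`) and accept-list (`-A`) options can do most of the work without any log parsing. A sketch, assuming the server sends Last-Modified headers (the URL is the same placeholder as above):

```shell
# re-download a file only when the server copy is newer than the local one,
# and restrict the recursive fetch to tarballs
wget -r -N --no-parent -A '*.tar.gz' http://some.served.dir.ca/
```

Run periodically from the same directory, this keeps a local mirror of just the tarballs, fetching only files that have changed on the server.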
