Equivalent of wget in Python to download website and resources

Question

The same thing was asked 2.5 years ago in "Downloading a web page and all of its resource files in Python", but that question didn't lead to an answer, and the "please see related topic" it points to isn't really asking the same thing.

I want to download everything on a page so that it can be viewed from the saved files alone.

The command

wget --page-requisites --domains=DOMAIN --no-parent --html-extension --convert-links --restrict-file-names=windows

does exactly what I need. However, we want to be able to tie it in with other stuff that must be portable, so it needs to be in Python.

I've been looking at Beautiful Soup, scrapy, and various spiders posted around the place, but these all seem to deal with getting data/links in clever but specific ways. Using them to do what I want looks like it would require a lot of work just to find all of the resources, when I'm sure there must be an easy way.

Many thanks

Recommended answer

You should be using an appropriate tool for the job at hand.

If you want to spider a site and save the pages to disk, Python probably isn't the best choice for that. Open source projects get features when someone needs that feature, and because wget does its job so well, nobody has bothered to try to write a Python library to replace it.

Considering wget runs on pretty much any platform that has a Python interpreter, is there a reason you can't use wget?
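If the Python requirement is mainly about tying the download into other Python code, one option is to call wget from Python instead of replacing it. The following is only a minimal sketch, assuming wget is installed and on the PATH. The function names `build_wget_command` and `download_page`, the `dest_dir` parameter, and the added `--directory-prefix` option are illustrative choices rather than part of the original question; the remaining flags are the ones quoted in the question.

```python
import shutil
import subprocess


def build_wget_command(url, domain, dest_dir="."):
    """Build the wget argument list from the question (hypothetical helper)."""
    return [
        "wget",
        "--page-requisites",              # also fetch images, CSS, scripts
        f"--domains={domain}",            # stay on the given domain
        "--no-parent",                    # never ascend to the parent directory
        "--html-extension",               # flag as written in the question
        "--convert-links",                # rewrite links for local viewing
        "--restrict-file-names=windows",  # file names safe on Windows too
        f"--directory-prefix={dest_dir}", # assumption: where to save the files
        url,
    ]


def download_page(url, domain, dest_dir="."):
    """Run wget and return its exit code (hypothetical helper)."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget was not found on PATH")
    # check=False: wget can exit non-zero when only some resources fail,
    # so the caller decides how strict to be about the exit code.
    result = subprocess.run(build_wget_command(url, domain, dest_dir), check=False)
    return result.returncode
```

Calling `download_page("https://example.com/docs/", "example.com", "mirror")` would save the page and its requisites under a `mirror` directory. Passing the arguments as a list, not as a single shell string, avoids quoting problems when the URL contains characters such as `&` or `?`.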
