What's the best way to save a complete webpage on a linux server?


Problem description

I need to archive complete pages, including any linked images etc., on my linux server. Looking for the best solution. Is there a way to save all assets and then relink them all to work in the same directory?

I've thought about using curl, but I'm unsure of how to do all of this. Also, will I maybe need PHP-DOM?

Is there a way to use firefox on the server and copy the temp files after the address has been loaded, or something similar?

Any and all input welcome.

Edit:

It seems as though wget is 'not' going to work, as the files need to be rendered. I have firefox installed on the server; is there a way to load the url in firefox, then grab the temp files and clear them afterwards?

Solution

wget can do that, for example:

  wget -r http://example.com/

This will mirror the whole example.com site.

Some interesting options are:

-Dexample.com: do not follow links to other domains

--html-extension: renames pages with a text/html content-type to .html

Manual: http://www.gnu.org/software/wget/manual/
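
If the pages don't depend on client-side rendering (the concern raised in the edit), wget's page-requisite and link-conversion options come closer to the "save all assets and relink them to work locally" goal than a plain -r. The command below is only a sketch built from those documented options, reusing the example.com URL from the answer; swap in the page you actually want to archive:

  # grab the page itself plus every image, stylesheet and script it
  # references (--page-requisites), rewrite the links so the saved copy
  # works when opened from disk (--convert-links), and give text/html
  # responses a .html suffix (--html-extension)
  wget --page-requisites --convert-links --html-extension http://example.com/

Adding -r on top of these flags turns the single-page archive into the kind of full offline mirror the answer describes.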


