What's the best way to save a complete webpage on a linux server?
Question
I need to archive complete pages, including any linked images etc., on my linux server. Looking for the best solution. Is there a way to save all assets and then relink them all to work in the same directory?
I've thought about using curl, but I'm unsure of how to do all of this. Also, will I maybe need PHP-DOM?
Is there a way to use firefox on the server and copy the temp files after the address has been loaded or similar?
Any input is welcome.
It seems as though wget is 'not' going to work as the files need to be rendered. I have firefox installed on the server, is there a way to load the url in firefox and then grab the temp files and clear the temp files after?
Answer
wget can do that, for example:

wget -r http://example.com/

This will mirror the whole example.com site.
Some interesting options are:

-Dexample.com : do not follow links to other domains
--html-extension : renames pages with text/html content-type to .html
Manual: http://www.gnu.org/software/wget/manual/
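For the original goal of saving a single page with all of its assets relinked to work locally, the relevant wget flags are -p (--page-requisites), -k (--convert-links), and -E (--html-extension), optionally with -H and -D to pull assets hosted on a CDN while staying within allowed domains. A minimal sketch (the URL and domain here are placeholders, not from the question):

```shell
#!/bin/sh
# Sketch: archive one page plus everything it needs, for offline viewing.
# Assumes GNU wget is installed; example.com is a placeholder domain.
#   -p  (--page-requisites) : also fetch images, CSS, JS the page needs
#   -k  (--convert-links)   : rewrite links so the local copy works
#   -E  (--html-extension)  : save text/html pages with an .html suffix
#   -H -Dexample.com        : span hosts, but only within example.com
CMD='wget -p -k -E -H -Dexample.com http://example.com/page.html'
echo "$CMD"
```

Note that, as the questioner suspected, wget does not execute JavaScript, so content injected at render time will not be captured; it only saves what the server sends.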