How to download a full website?
Question
After fixing the code of a website to use a CDN (rewriting all the URLs to images, JS & CSS), I need to test all the pages on the domain to make sure all the resources are fetched from the CDN.
All the site's pages are accessible through links; there are no isolated pages.
Currently I'm using FireBug and checking the "Net" view...
Is there some automated way to give a domain name and request all pages + resources of the domain?
Update:

OK, I found I can use wget like so:
wget -p --no-cache -e robots=off -m -H -D cdn.domain.com,www.domain.com -o site1.log www.domain.com
Explanation of the options:

-p - download resources too (images, css, js, etc.)
--no-cache - get the real object, do not return the server-cached object
-e robots=off - disregard robots and no-follow directives
-m - mirror the site (follow links)
-H - span hosts (follow other domains too)
-D cdn.domain.com,www.domain.com - specify which domains to follow, otherwise every link on the page will be followed
-o site1.log - log to the file site1.log
-U "Mozilla/5.0" - optional: fake the user agent - useful if the server returns different data for different browsers
www.domain.com - the site to download
Enjoy!
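One way to check the result afterwards: with -m and multiple hosts, wget saves each host's files under a directory named after that host (www.domain.com/, cdn.domain.com/), so any image/CSS/JS file that lands under the origin's directory was fetched from the origin rather than the CDN. A minimal sketch, assuming the directory layout produced by the command above (the extension list is an assumption to adapt):

```python
import os

# Sketch: list resource files (by extension) that ended up under the
# origin host's mirror directory instead of the CDN's. The extension
# set and the default directory name are assumptions -- adjust them.
RESOURCE_EXTS = {".png", ".jpg", ".gif", ".css", ".js"}

def non_cdn_resources(mirror_root="www.domain.com"):
    """Return resource files found under the origin's mirror directory."""
    hits = []
    for dirpath, _dirs, files in os.walk(mirror_root):
        for name in files:
            if os.path.splitext(name)[1].lower() in RESOURCE_EXTS:
                hits.append(os.path.join(dirpath, name))
    return hits
```

An empty result means every image, stylesheet and script was saved under the CDN's directory, i.e. fetched from the CDN.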
Answer
The wget documentation has this bit in it:
Actually, to download a single page and all its requisites (even if they exist on separate websites), and make sure the lot displays properly locally, this author likes to use a few options in addition to ‘-p’:
wget -E -H -k -K -p http://site/document
The key is the -H option, which means --span-hosts -> go to foreign hosts when recursive. I don't know if this also applies to normal hyperlinks or only to resources, but you should try it out.
You can consider an alternate strategy. You don't need to download the resources to test that they are referenced from the CDN. You can just get the source code for the pages you're interested in (you can use wget, as you did, or curl, or something else) and either:
- parse it using a library - which one depends on the language you're using for scripting. Check each <img />, <link /> and <script /> for CDN links.
- use regexes to check that the resource URLs contain the CDN domain. See this :), although in this limited case it might not be overly complicated.
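The first option can be sketched with Python's standard-library html.parser; CDN_HOST and the choice of tags/attributes are assumptions to adapt to your site:

```python
from html.parser import HTMLParser

# Assumed CDN host name -- substitute your real CDN domain.
CDN_HOST = "cdn.domain.com"

class CdnChecker(HTMLParser):
    """Collect src/href URLs from <img>, <link> and <script> tags
    that do not point at the CDN."""
    def __init__(self):
        super().__init__()
        self.offenders = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "link", "script"):
            return
        for name, value in attrs:
            # Only absolute URLs are checked; relative ones resolve
            # against the page's own host.
            if name in ("src", "href") and value and "://" in value:
                if CDN_HOST not in value:
                    self.offenders.append((tag, value))

def find_non_cdn(html):
    """Return (tag, url) pairs whose resource URL is not on the CDN."""
    checker = CdnChecker()
    checker.feed(html)
    return checker.offenders
```

Feed it the source of each page you fetched; an empty list means every tag it looks at references the CDN.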
You should also check all CSS files for url() links - they should also point to CDN images. Depending on the logic of your application, you may need to check that the JavaScript code does not create any images that do not come from the CDN.
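The CSS check can be sketched with a simple regex over url(...) references (the CDN host is an assumption; relative URLs are skipped because they resolve against the CSS file's own location, which is already on the CDN):

```python
import re

# Assumed CDN host name -- substitute your real CDN domain.
CDN_HOST = "cdn.domain.com"

# Matches url(...), with or without quotes around the target.
URL_RE = re.compile(r"""url\(\s*['"]?([^'")\s]+)['"]?\s*\)""", re.IGNORECASE)

def non_cdn_css_urls(css_text):
    """Return absolute url() targets that do not point at the CDN."""
    return [u for u in URL_RE.findall(css_text)
            if "://" in u and CDN_HOST not in u]
```

Run it over each downloaded stylesheet; any URL it returns is an image (or font) still being served from outside the CDN.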