C# code for saving an entire web page? (with images/formatting)

Question

I've been struggling to find an example of some C# code (I'm using Visual C# 2008 Express) that can programmatically save an entire web page (given a URL), including the images and formatting (e.g. CSS). The intention is that in a subsequent phase I'd ship this off (not sure how yet) so it could be viewed later via a browser.

Is there an example of the simplest approach (leveraging the .NET Framework methods) to save an entire web page? Saving as one page with a subdirectory for images, or otherwise. Basically the same as what you get from a browser when you choose "save entire web page".

Recommended answer

The simplest way is probably to add a WebBrowser Control to your application and point it at the page you want to save using the Navigate() method.

Then, when the document has loaded, call the ShowSaveAsDialog method. The user can then save the page as a single file, or a file with images in a subdirectory.
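
A minimal sketch of this interactive approach, assuming a Windows Forms application. The class name `SavePageForm`, the handler name, and the example URL are placeholders introduced here for illustration; `Navigate`, `DocumentCompleted`, and `ShowSaveAsDialog` are members of the `WebBrowser` control.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form that loads a page and then opens the browser's "Save As" dialog.
public class SavePageForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();

    public SavePageForm(string url)
    {
        browser.Dock = DockStyle.Fill;
        Controls.Add(browser);

        // DocumentCompleted is raised once the document has finished loading.
        browser.DocumentCompleted += OnDocumentCompleted;
        browser.Navigate(url);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can also fire for frames inside the page, so (as a simple
        // heuristic) only react when the completed URL is the top-level one.
        if (e.Url != browser.Url) return;

        browser.DocumentCompleted -= OnDocumentCompleted;

        // The user picks the format here (single file, or HTML plus a folder of images).
        browser.ShowSaveAsDialog();
    }

    [STAThread]
    public static void Main()
    {
        // Placeholder URL, not taken from the question.
        Application.Run(new SavePageForm("http://example.com/"));
    }
}
```

Note that the dialog still needs a person to confirm the file name and format, so this only covers the interactive case.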

[Update]

Having now noticed "programmatically" in your question, the above approach is not ideal, as it requires either user involvement or delving into the Windows API to send input using SendKeys or similar.

There is nothing built-in to the .NET Framework that does all of what you ask.

So my revised approach is:

  • Use System.Net.HttpWebRequest to get the main HTML document as a string or stream (easy).
  • Load this into an HtmlAgilityPack document, where you can easily query the document to get lists of all image elements, stylesheet links, etc.
  • Then make a separate web request for each of these files and save them to a subdirectory.
  • Finally, update all relevant links in the main page to point to the items in the subdirectory.
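
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a complete implementation: it assumes the third-party HtmlAgilityPack library is referenced in the project, and the names `PageSaver`, `LocalNameFor`, `page_files`, and `index.html` are made up for this example. It only handles images, stylesheets, and scripts referenced directly from the HTML; images referenced from inside CSS files are not followed, and the page is read as UTF-8 regardless of its declared encoding.

```csharp
using System;
using System.IO;
using System.Net;
using HtmlAgilityPack; // third-party library, assumed to be referenced

public static class PageSaver
{
    private const string FilesDirName = "page_files"; // hypothetical subdirectory name

    // Hypothetical helper: builds a unique, file-system-safe local name for a resource.
    public static string LocalNameFor(Uri resource, int index)
    {
        string path = resource.AbsolutePath;             // excludes the query string
        string name = path.Substring(path.LastIndexOf('/') + 1);
        foreach (char c in Path.GetInvalidFileNameChars())
            name = name.Replace(c, '_');
        if (name.Length == 0) name = "resource";
        return index + "_" + name;
    }

    public static void Save(string url, string targetDir)
    {
        Uri pageUri = new Uri(url);
        string filesDir = Path.Combine(targetDir, FilesDirName);
        Directory.CreateDirectory(filesDir);

        // Step 1: fetch the main HTML document as a string.
        string html;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(pageUri);
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            html = reader.ReadToEnd();
        }

        // Step 2: parse it so that elements can be queried with XPath.
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Each entry: XPath selecting the elements, and the attribute holding the link.
        string[][] targets = {
            new[] { "//img[@src]", "src" },
            new[] { "//link[@rel='stylesheet'][@href]", "href" },
            new[] { "//script[@src]", "src" }
        };

        int index = 0;
        using (WebClient client = new WebClient())
        {
            foreach (string[] target in targets)
            {
                HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(target[0]);
                if (nodes == null) continue; // SelectNodes returns null when nothing matches

                foreach (HtmlNode node in nodes)
                {
                    // Resolve relative links against the page URL.
                    Uri resourceUri;
                    string raw = node.GetAttributeValue(target[1], "");
                    if (!Uri.TryCreate(pageUri, raw, out resourceUri)) continue;
                    if (resourceUri.Scheme != Uri.UriSchemeHttp &&
                        resourceUri.Scheme != Uri.UriSchemeHttps) continue; // skip data: URIs etc.

                    string localName = LocalNameFor(resourceUri, index++);
                    try
                    {
                        // Step 3: download each resource into the subdirectory.
                        client.DownloadFile(resourceUri, Path.Combine(filesDir, localName));
                        // Step 4: point the link at the local copy.
                        node.SetAttributeValue(target[1], FilesDirName + "/" + localName);
                    }
                    catch (WebException)
                    {
                        // Download failed: keep the original link untouched.
                    }
                }
            }
        }

        doc.Save(Path.Combine(targetDir, "index.html"));
    }
}
```

A call such as `PageSaver.Save("http://example.com/", @"C:\temp\saved")` (both arguments are placeholders) would then write `index.html` plus a `page_files` folder. `WebClient` is used for the resource downloads purely for brevity; separate `HttpWebRequest` calls, as described in the list, would work equally well.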

In effect you would be implementing a very simple web browser. You may run into issues with pages that use JavaScript to dynamically alter or request page content, but for most pages this should give acceptable results.
