Get HTML source code after JavaScript and AJAX calls


Question




Hi,

When I compare the source code of a web page as shown in the browser (e.g. YouTube) with the source code I get from the code below, the two differ. I suspect the difference is caused by DOM manipulation.

    var webGet = new HtmlWeb();
    HtmlAgilityPack.HtmlDocument document = webGet.Load(_url);




Is it possible to get the HTML source code programmatically (with C#) after the JavaScript and/or AJAX manipulations have run?

Thanks in advance!

Recommended answer

Yes and no. Those DOM manipulations happen purely on the client side, after the HTTP server has already delivered the document in the HttpWebResponse. So, if you only download the HTML file from the server (using HttpWebRequest), you get the document as it was before its DOM was manipulated.

So, what can you do? You can reproduce all those manipulations on the client side, as the Web browser does. For this purpose, you can navigate to the Web page using System.Windows.Forms.WebBrowser. You can even manipulate the DOM yourself through an instance of this class. See the properties System.Windows.Forms.WebBrowser.Document and System.Windows.Forms.WebBrowser.DocumentText, and the events System.Windows.Forms.WebBrowser.Navigated and System.Windows.Forms.WebBrowser.DocumentCompleted, at http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.aspx.

—SA
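
As an illustration of the WebBrowser approach described in the answer, here is a minimal sketch; it is not part of the original answer. The class name RenderedHtml and the method Get are hypothetical names chosen for this example. It assumes a Windows machine with Windows Forms available, runs the control on its own STA thread with a message loop, and reads the live DOM through Document rather than DocumentText, on the assumption that DocumentText may still reflect the markup as downloaded.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper, for illustration only.
static class RenderedHtml
{
    // Loads the page in a WebBrowser control and returns the DOM
    // serialized at the time DocumentCompleted fires.
    public static string Get(string url)
    {
        string html = null;

        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                browser.DocumentCompleted += (sender, e) =>
                {
                    // The event can also fire for frames; wait until the
                    // control reports that the whole page is complete.
                    if (browser.ReadyState != WebBrowserReadyState.Complete)
                        return;

                    // Serialize the current DOM, including script changes
                    // made so far.
                    HtmlElementCollection roots =
                        browser.Document.GetElementsByTagName("html");
                    if (roots.Count > 0)
                        html = roots[0].OuterHtml;

                    // Stop the message loop started by Application.Run().
                    Application.ExitThread();
                };

                browser.Navigate(url);
                Application.Run(); // pump messages until ExitThread is called
            }
        });

        // WebBrowser wraps an ActiveX control and requires an STA thread.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join(); // no timeout here; a real program would likely add one

        return html;
    }
}
```

A possible use would be `string html = RenderedHtml.Get(_url);`, followed by `document.LoadHtml(html)` on an HtmlAgilityPack.HtmlDocument to continue parsing as in the question. Content fetched by AJAX calls that finish after DocumentCompleted will not be in the result; in that case the DOM would need to be read later, for example after a delay or once an expected element appears.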

