Get the final generated HTML source using C# or VB.NET
Using VB.NET or C#, how do I get the generated HTML source?
To get the HTML source of a page I can use the code below, but it won't return the generated source: it won't contain any of the HTML that JavaScript added dynamically in the browser. How do I get the final generated HTML source?
Thanks.
WebRequest req = WebRequest.Create("http://www.asp.net");
WebResponse res = req.GetResponse();
StreamReader sr = new StreamReader(res.GetResponseStream());
string html = sr.ReadToEnd();
If I try the code below, it returns the document without the content injected by the JavaScript code:
Public Class Form1
    Dim WB As WebBrowser = Nothing

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        WB = New WebBrowser()
        Me.Controls.Add(WB)
        AddHandler WB.DocumentCompleted, AddressOf WebBrowser1_DocumentCompleted
        WB.Navigate("mysite/Default.aspx")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        'Dim htmlcode As String = WebBrowser1.Document.Body.OuterHtml()
        Dim s As String = WB.DocumentText
    End Sub
End Class
HTML returned:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title></title>
</head>
<body>
<form id="form1" runat="server">
<div id="center_text_panel">
//test text this text should be here
</div>
</form>
</body>
</html>
<script type="text/javascript">
document.getElementById("center_text_panel").innerText = "test text";
</script>
You can use WebKit.NET.
Look here for official tutorials.
It can not only grab the source, but also process the page's JavaScript, and it signals through an event when the page has loaded.
webKitBrowser1.Navigate(MyURL)
Then handle the DocumentCompleted event and, inside the handler, read the document text:
Dim documentContent As String = webKitBrowser1.DocumentText
Edit: this might be a better open-source WebKit option: http://code.google.com/p/open-webkit-sharp/
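The steps in this answer can be put together as a minimal C# sketch. It only uses the members named above (Navigate, DocumentCompleted, DocumentText). The `WebKit` namespace, the `WebKitBrowser` class name, the `MainForm` class, and the target URL are assumptions for illustration. Whether DocumentText reflects every script-driven change may depend on when the page's scripts finish, so treat this as a starting point rather than a definitive implementation.

```csharp
using System;
using System.Windows.Forms;
using WebKit; // assumption: the WebKit.NET assembly is referenced by the project

// Hypothetical form that hosts the browser control and dumps the page text.
public class MainForm : Form
{
    private readonly WebKitBrowser browser = new WebKitBrowser();

    public MainForm()
    {
        browser.Dock = DockStyle.Fill;
        Controls.Add(browser);

        // Subscribe before navigating so the completion event is not missed.
        browser.DocumentCompleted += OnDocumentCompleted;

        // Start navigation once the form (and the control's handle) exists.
        Load += (s, e) => browser.Navigate("http://www.asp.net");
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Read the document text after the page has finished loading.
        string documentContent = browser.DocumentText;
        Console.WriteLine(documentContent);
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new MainForm());
    }
}
```

If scripts on the page keep modifying the DOM after DocumentCompleted fires, reading the text again after a short delay (for example from a Timer) is one possible workaround.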