How to get raw page source (not generated source) from C#
Problem description
The goal is to get the raw source of the page, that is, without running any scripts or letting a browser reformat the markup. For example, suppose the source is <table><tr></table>
After the response arrives, I don't want to get <table><tbody><tr></tr></tbody></table>
How can I do this in C# code?
More info: for example, typing "view-source:http://feeds.gawker.com/kotaku/full" in the browser's address bar gives you an XML file, but if you just open "http://feeds.gawker.com/kotaku/full" the browser renders an HTML page. What I want is the XML file. I hope this is clear.
Here's one way, but it's not really clear what you actually want.
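using System.Net;

// DownloadString returns the response body as text;
// no scripts are run and no DOM is built, so the markup is not normalized.
using (var wc = new WebClient())
{
    var source = wc.DownloadString("http://google.com");
}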
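For the feed case described in the question, a variant based on HttpClient may be worth trying. The following is only a sketch under assumptions: the class name RawSource and its two methods are hypothetical, and the Accept header is a guess that the server might choose its response format by content negotiation, which the question does not confirm. It may well be that a plain request already returns the XML and only the browser renders it as a page.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class RawSource
{
    // Hypothetical helper: builds a GET request that prefers XML.
    // The Accept header is an assumption; the server may ignore it.
    public static HttpRequestMessage BuildRequest(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/xml"));
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("*/*", 0.1));
        return request;
    }

    // Sends the request and returns the body as text.
    // Nothing here parses HTML or runs scripts, so the markup is not rewritten.
    public static async Task<string> FetchAsync(HttpClient client, string url)
    {
        using (var request = BuildRequest(url))
        using (var response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

From an async method, a call would look like: var xml = await RawSource.FetchAsync(new HttpClient(), "http://feeds.gawker.com/kotaku/full"); In newer .NET versions WebClient is marked obsolete in favor of HttpClient, which is the main reason to prefer this shape.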