Get HTML that is generated via AJAX in WebClient

Question

I often go to a site to look stuff up. I thought to myself: "Hold on. I can program. Why am I going to this site manually when I can write a piece of software that does it for me?".

And so I started. I'm using C#, so I found WebClient and Uri.

I've managed to get the source code for the site, yet the problem occurred that the specific data I'm looking for is generated via AJAX, after the source code has loaded.

So that's my problem. How can I get that code, if it needs to be requested via an AJAX call first?

Recommended answer

The general approach is this:

  1. using a tool like Fiddler, find out which HTTP requests are made by the browser in order to fetch the data you're looking for.
  2. use WebClient to fetch the HTTP request(s) you need.
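
Step 2 might look roughly like the sketch below. The `AjaxFetcher` class, the user-agent string, and the `X-Requested-With` header are assumptions for illustration only; the URL you pass in and the headers the server actually needs are whatever Fiddler shows for the site in question.

```csharp
using System;
using System.Net;

static class AjaxFetcher
{
    // Builds a WebClient carrying headers that a browser's AJAX call often sends.
    // Both values are placeholders (assumptions); copy the real ones from Fiddler.
    public static WebClient CreateClient()
    {
        var client = new WebClient();
        client.Headers.Add("user-agent", "Mozilla/5.0 (compatible; ExampleScraper/1.0)");
        client.Headers.Add("X-Requested-With", "XMLHttpRequest");
        return client;
    }

    // Downloads the response body of the AJAX endpoint as a string.
    public static string Fetch(string ajaxUrl)
    {
        using (var client = CreateClient())
        {
            return client.DownloadString(ajaxUrl);
        }
    }
}
```

Calling `AjaxFetcher.Fetch(url)` with the endpoint URL found in step 1 returns the raw response (often JSON or an HTML fragment), which you then parse yourself.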

Take a look at my answer to this question (http://stackoverflow.com/questions/1471062/c-webclient-view-source-question/1521292#1521292) for more details about HTML screen scraping and how to work around various issues you may run across.

For #1 above, here's how to use Fiddler to understand how a specific request is being made:

First, find the request you care about (the one whose response contains the data you want). You can do this by double-clicking each request in the left pane in Fiddler and looking inside the "TextView" tab in the lower-right pane. You can also use CTRL+F to find content across multiple requests, but some responses are compressed, so make sure the "AutoDecode" button is selected in the toolbar before making your requests if you want to be able to text-search across all of them.

Once you've found the request you want, double-click it in Fiddler and select the "Headers" tab in the upper-right pane. Those are the headers being sent. If your client sends exactly these headers to the server, you should get back the same data. Usually not all of the headers are needed, though, so you'll want to figure out which ones are. You can do this with Fiddler's Request Builder tab in the upper-right pane. Select that tab and drag your data request over from the left pane onto the Request Builder. Then submit the request to validate that it returns the correct results. Next, delete one header and resubmit. If the request stops working, that header was required, so put it back. Repeat for each header until you know which ones are required.

Then you'll need to write code to generate the right headers. Don't worry about the Host: header; that's generated automatically for you. For the Cookie: header, you'll need to generate it using the CookieContainer class. The other headers (e.g. User-Agent:, Accept:, etc.) you can generally copy and add to your request as-is.
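
As a rough sketch of what that code could look like: WebClient does not expose a CookieContainer property directly, so this example drops down to HttpWebRequest, which does. The `AjaxRequestBuilder` class and every header value here are hypothetical placeholders; use the values you identified in Fiddler.

```csharp
using System;
using System.IO;
using System.Net;

static class AjaxRequestBuilder
{
    // Creates a GET request that mimics the browser's AJAX call.
    // Header values are placeholders (assumptions), not taken from any real site.
    public static HttpWebRequest Create(string ajaxUrl, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(ajaxUrl);
        request.Method = "GET";
        // The container supplies the Cookie: header and collects Set-Cookie values.
        request.CookieContainer = cookies;
        // User-Agent and Accept are set through properties on HttpWebRequest.
        request.UserAgent = "Mozilla/5.0 (compatible; ExampleScraper/1.0)";
        request.Accept = "application/json, text/javascript, */*";
        // Other headers can be added to the Headers collection as-is.
        request.Headers["X-Requested-With"] = "XMLHttpRequest";
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

If the site relies on session cookies, one option is to request the regular page first with the same CookieContainer instance, so that the cookies the server sets are stored, and then reuse that container for the AJAX request.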
