Screen-scraping a site with an ASP.NET form login in C#?
Question
Would it be possible to write a screen-scraper for a website protected by a form login? I have access to the site, of course, but I have no idea how to log in to the site and save my credentials in C#.
Also, any good examples of screen-scrapers in C# would be hugely appreciated.
Has this already been done?
Recommended answer
It's pretty simple. You need your own login method that submits the login form as an HTTP POST (HttpPost).
You can come up with something like this. It collects the cookies set at login, and you then just need to pass them to the next HttpWebRequest:
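    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    public static HttpWebResponse HttpPost(String url, String referer, String userAgent, ref CookieCollection cookies, String postData, out WebHeaderCollection headers, WebProxy proxy)
    {
        headers = null;
        try
        {
            HttpWebRequest http = (HttpWebRequest)WebRequest.Create(url);
            // Only override the default proxy when one was supplied.
            if (proxy != null)
            {
                http.Proxy = proxy;
            }
            http.AllowAutoRedirect = true;
            http.Method = "POST";
            http.ContentType = "application/x-www-form-urlencoded";
            http.UserAgent = userAgent;
            // Send the cookies collected so far (for example, a session cookie).
            http.CookieContainer = new CookieContainer();
            http.CookieContainer.Add(cookies);
            http.Referer = referer;

            // Write the URL-encoded form fields as the request body.
            byte[] dataBytes = Encoding.UTF8.GetBytes(postData);
            http.ContentLength = dataBytes.Length;
            using (Stream postStream = http.GetRequestStream())
            {
                postStream.Write(dataBytes, 0, dataBytes.Length);
            }

            HttpWebResponse httpResponse = (HttpWebResponse)http.GetResponse();
            // Expose the response headers to the caller.
            headers = httpResponse.Headers;
            // Keep the cookies from the response so the next request can reuse them.
            // With AllowAutoRedirect on, cookies set by an intermediate redirect
            // may only be available in http.CookieContainer.
            cookies.Add(httpResponse.Cookies);
            return httpResponse;
        }
        catch (WebException)
        {
            // The request failed (network error or HTTP error status); signal it with null.
            return null;
        }
    }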
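The answer leaves open what goes into postData and how the cookies are reused afterwards. ASP.NET Web Forms login pages usually render hidden inputs such as __VIEWSTATE and __EVENTVALIDATION that the server expects to receive back with the credentials. The following is a minimal sketch under that assumption, not a definitive implementation: the names FormLoginSketch, ExtractHiddenFields, BuildPostData and GetPage are hypothetical, the regular expression assumes the attributes appear in the order type, name, value, and WebUtility.UrlEncode requires .NET 4.5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class FormLoginSketch
{
    // Collects name/value pairs of <input type="hidden"> elements
    // (for example __VIEWSTATE) from the login page HTML.
    // Assumption: attributes are rendered in the order type, name, value.
    public static Dictionary<string, string> ExtractHiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        var pattern = @"<input[^>]*type=""hidden""[^>]*name=""([^""]+)""[^>]*value=""([^""]*)""";
        foreach (Match match in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            // Attribute values are HTML-encoded in the page source.
            fields[match.Groups[1].Value] = WebUtility.HtmlDecode(match.Groups[2].Value);
        }
        return fields;
    }

    // Builds an application/x-www-form-urlencoded body from the given fields.
    public static string BuildPostData(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (var field in fields)
        {
            parts.Add(WebUtility.UrlEncode(field.Key) + "=" + WebUtility.UrlEncode(field.Value));
        }
        return string.Join("&", parts);
    }

    // Downloads a page with a GET request, sending the cookies obtained at login.
    public static string GetPage(string url, CookieCollection cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = new CookieContainer();
        request.CookieContainer.Add(cookies);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

One possible flow: download the login page, call ExtractHiddenFields on its HTML, add the user name and password under the input names found in that page's source (they are site-specific), pass the result of BuildPostData as postData to HttpPost, and then call GetPage for the protected page with the cookies that HttpPost filled in.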