Techniques for logging into websites programmatically

Question

I am trying to automate logging into Photobucket through its API for a project that requires automated photo downloading using stored credentials.

The API generates a URL to use for logging in, and with Firebug I can see which requests and responses are being sent and received.

My question is: how can I use HttpWebRequest and HttpWebResponse in C# to mimic what happens in the browser?

Would it be possible to use a web browser component inside a C# app, populate the username and password fields, and submit the login?

Recommended answer

I've done this kind of thing before, and ended up with a nice toolkit for writing these types of applications. I've used this toolkit to handle non-trivial back-and-forth web requests, so it's entirely possible, and not extremely difficult.

I found out quickly that doing the HttpWebRequest/HttpWebResponse work from scratch was lower-level than I wanted to deal with. My tools are based entirely around the HtmlAgilityPack by Simon Mourier. It's an excellent toolset. It does a lot of the heavy lifting for you and makes parsing the fetched HTML really easy. If you can rock XPath queries, the HtmlAgilityPack is where you want to start. It handles poorly formed HTML quite well too!
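As an illustration of the XPath-based parsing described above, here is a minimal sketch that pulls the name/value pairs of every input element out of a fetched page, which is typically the first step when working out what a login form expects. The class name FormFieldReader and the method ReadInputs are hypothetical names chosen for this example; the HtmlAgilityPack calls (LoadHtml, SelectNodes, GetAttributeValue) are part of the library, which is assumed to be referenced from NuGet.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper for this example; not part of the answer's toolkit.
public static class FormFieldReader
{
    // Collects name/value pairs from every <input> element that has a name attribute.
    public static Dictionary<string, string> ReadInputs(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html); // tolerant of unclosed or mis-nested tags

        var fields = new Dictionary<string, string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection inputs = document.DocumentNode.SelectNodes("//input[@name]");
        if (inputs == null) { return fields; }

        foreach (HtmlNode input in inputs)
        {
            // The second argument is the default used when the attribute is absent.
            fields[input.GetAttributeValue("name", "")] = input.GetAttributeValue("value", "");
        }
        return fields;
    }
}
```

Hidden inputs such as tokens or form ids show up in the returned dictionary alongside the visible fields, so they can be echoed back in the login POST.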

You still need a good tool to help debug. Besides what your debugger gives you, being able to inspect the HTTP/HTTPS traffic as it goes back and forth across the wire is priceless. Since your code is going to be making these requests, not your browser, Firebug isn't going to be of much help debugging your code. There are all sorts of packet-sniffer tools, but for HTTP/HTTPS debugging, I don't think you can beat the ease of use and power of Fiddler 2. The newest version even comes with a plugin for Firefox to quickly divert requests through Fiddler and back. Because it can also act as a seamless HTTPS proxy, you can inspect your HTTPS traffic as well.
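If the requests from your own code do not show up in the debugging proxy automatically, one option is to point each request at it explicitly. The sketch below assumes Fiddler is running on the same machine and listening on its default port, 8888; the class name DebugProxy and the method CreateRequest are hypothetical names for this example. Inspecting HTTPS traffic this way also assumes HTTPS decryption is enabled in Fiddler and its root certificate is trusted.

```csharp
using System;
using System.Net;

// Hypothetical helper for this example.
public static class DebugProxy
{
    // Assumption: Fiddler is listening locally on its default port, 8888.
    private const string ProxyHost = "127.0.0.1";
    private const int ProxyPort = 8888;

    // Creates a request and, when asked, routes it through the local debugging proxy.
    public static HttpWebRequest CreateRequest(string url, bool routeThroughProxy)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        if (routeThroughProxy)
        {
            request.Proxy = new WebProxy(ProxyHost, ProxyPort);
        }
        return request;
    }
}
```

Keeping the switch as a parameter makes it easy to turn the proxy off again once the login sequence works.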

Give 'em a try; I'm sure they'll be two indispensable tools in your hacking.

Update: Added the code example below. This is pulled from a not-much-larger "Session" class that logs into a website and keeps hold of the related cookies for you. I chose this because it does more than a simple 'please fetch that web page for me' example, plus it has a line or two of XPath querying against the final destination page.

public bool Connect() {
   if (string.IsNullOrEmpty(_Username)) { base.ThrowHelper(new SessionException("Username not specified.")); } 
   if (string.IsNullOrEmpty(_Password)) { base.ThrowHelper(new SessionException("Password not specified.")); }

   _Cookies = new CookieContainer();
   HtmlWeb webFetcher = new HtmlWeb();
   webFetcher.UsingCache = false;
   webFetcher.UseCookies = true;

   HtmlWeb.PreRequestHandler justSetCookies = delegate(HttpWebRequest webRequest) {
      SetRequestHeaders(webRequest, false);
      return true;
   };
   HtmlWeb.PreRequestHandler postLoginInformation = delegate(HttpWebRequest webRequest) {
      SetRequestHeaders(webRequest, false);

      // Before we let webFetcher get the response from the server, we must POST the login form's data.
      // This posted form data is *VERY* specific to the web site in question, and it must be exactly right,
      // and exactly what the remote server is expecting, otherwise it will not work!
      //
      // You need to use an HTTP proxy/debugger such as Fiddler in order to adequately inspect the
      // posted form data.
      ASCIIEncoding encoding = new ASCIIEncoding();
      // URL-encode the credentials so characters such as '&', '=' or '+' do not corrupt the form body.
      string postDataString = string.Format("edit%5Bname%5D={0}&edit%5Bpass%5D={1}&edit%5Bform_id%5D=user_login&op=Log+in",
         Uri.EscapeDataString(_Username), Uri.EscapeDataString(_Password));
      byte[] postData = encoding.GetBytes(postDataString);
      webRequest.ContentType = "application/x-www-form-urlencoded";
      webRequest.ContentLength = postData.Length;
      webRequest.Referer = Util.MakeUrlCore("/user"); // builds a proper-for-this-website referer string

      using (Stream postStream = webRequest.GetRequestStream()) {
         postStream.Write(postData, 0, postData.Length); // the using block closes the stream
      }

      return true;
   };

   string loginUrl = Util.GetUrlCore(ProjectUrl.Login); 
   bool atEndOfRedirects = false;
   string method = "POST";
   webFetcher.PreRequest = postLoginInformation;

   // This handler has been trimmed for the example; the original dealt with one of those
   // 'interesting' login processes...
   webFetcher.PostResponse = delegate(HttpWebRequest webRequest, HttpWebResponse response) {
      if (response.StatusCode == HttpStatusCode.Found) {
         // the login process is forwarding us on...update the URL to move to...
         loginUrl = response.Headers["Location"] as String;
         method = "GET";
         webFetcher.PreRequest = justSetCookies; // we only need to post cookies now, not all the login info
      } else {
         atEndOfRedirects = true;
      }

      foreach (Cookie cookie in response.Cookies) {
         // *snip*
      }
   };

   // Real work starts here:
   HtmlDocument retrievedDocument = null;
   while (!atEndOfRedirects) {
      retrievedDocument = webFetcher.Load(loginUrl, method);
   }


   // The redirects are finished. Check the returned HTML to see whether we landed on an error page
   // or logged in successfully.
   if (retrievedDocument != null) {
      HtmlNode errorNode = retrievedDocument.DocumentNode.SelectSingleNode("//div[contains(@class, 'error')]");
      if (errorNode != null) { return false; }
   }

   return true; 
}


public void SetRequestHeaders(HttpWebRequest webRequest) { SetRequestHeaders(webRequest, true); }
public void SetRequestHeaders(HttpWebRequest webRequest, bool allowAutoRedirect) {
   try {
      webRequest.AllowAutoRedirect = allowAutoRedirect;
      webRequest.CookieContainer = _Cookies;

      // the rest of this stuff is just to try and make our request *look* like FireFox. 
      webRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3";
      webRequest.Accept = @"text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
      webRequest.KeepAlive = true;
      webRequest.Headers.Add(@"Accept-Language: en-us,en;q=0.5");
      //webRequest.Headers.Add(@"Accept-Encoding: gzip,deflate");
   }
   catch (Exception ex) { base.ThrowHelper(ex); }
}
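For comparison, the same idea can be sketched with HttpWebRequest and HttpWebResponse alone, which is what the question originally asked about. This is a minimal sketch, not the Session class above: PlainLogin, BuildFormBody, and PostForm are hypothetical names, and the field names and URL a real site expects have to be read from its actual traffic. The essential points are that the form body is URL-encoded and that one CookieContainer is shared across requests so the session cookie is sent on later calls.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper for this example.
public static class PlainLogin
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
        {
            parts.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        }
        return string.Join("&", parts);
    }

    // POSTs the form body and returns the response text.
    // Cookies set by the server are stored in the supplied container.
    public static string PostForm(string url, string body, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] payload = Encoding.UTF8.GetBytes(body);
        request.ContentLength = payload.Length;
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(payload, 0, payload.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A caller would create one CookieContainer, pass it to PostForm for the login request, and then assign the same container to the CookieContainer property of every later request, such as the photo downloads.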
