Indy - IdHTTP: how to handle page redirects?

Question

Using: Delphi 2010, latest version of Indy

I am trying to scrape data from Google's AdSense web page in order to retrieve the reports, but so far without success: my application stops after the first request and does not proceed.

Using Fiddler to inspect the traffic to the Google AdSense website while a web browser loads the AdSense page, I can see that the browser's request goes through a number of redirects before the page is loaded.

However, my Delphi application generates only a couple of requests before it stops.

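Here are the steps I followed:


  1. Drop an IdHTTP component and an IdSSLIOHandlerSocketOpenSSL1 component on the form.

  2. Set the IdHTTP component's AllowCookies and HandleRedirects properties to True, and its IOHandler property to IdSSLIOHandlerSocketOpenSSL1.

  3. Set the IdSSLIOHandlerSocketOpenSSL1 component's Method property (under SSLOptions) to sslvSSLv23.

Finally, I have this code: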

procedure TfmMain.GetUrlToFile(AURL, AFile : String);
var
 Output : TMemoryStream;
begin
  Output := TMemoryStream.Create;
  try
    IdHTTP1.Get(FURL, Output);
    Output.SaveToFile(AFile);
  finally
    Output.Free;
  end;
end;

However, it does not get to the login page as expected. I would expect it to behave like a web browser and follow the redirects until it reaches the final page.

This is the header output from Fiddler:


HTTP/1.1 302 Found
Location: https://encrypted.google.com/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=5166063f01b64b03:FF=0:TM=1293571783:LM=1293571783:S=a5OtsOqxu_GiV3d6; expires=Thu, 27-Dec-2012 21:29:43 GMT; path=/; domain=.google.com
Set-Cookie: NID=42=XFUwZdkyF0TJKmoJjqoGgYNtGyOz-Irvz7ivao2z0--pCBKPpAvCGUeaa5GXLneP41wlpse-yU5UuC57pBfMkv434t7XB1H68ET0ZgVDNEPNmIVEQRVj7AA1Lnvv2Aez; expires=Wed, 29-Jun-2011 21:29:43 GMT; path=/; domain=.google.com; HttpOnly
Date: Tue, 28 Dec 2010 21:29:43 GMT
Server: gws
Content-Length: 226
X-XSS-Protection: 1; mode=block


Firstly, is there anything wrong with this output?

Is there something more I should do to get the IdHTTP component to keep following the redirects until it reaches the final page?

Recommended answer

IdHTTP component property values prior to making the call:

    Name := 'IdHTTP1';
    IOHandler := IdSSLIOHandlerSocketOpenSSL1;
    AllowCookies := True;
    HandleRedirects := True;
    RedirectMaximum := 35;
    Request.UserAgent := 
      'Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.' +
      '0b8';
    HTTPOptions := [hoForceEncodeParams];
    OnRedirect := IdHTTP1Redirect;
    CookieManager := IdCookieManager1;

Redirect event handler:

procedure TfmMain.IdHTTP1Redirect(Sender: TObject; var dest: string; var
    NumRedirect: Integer; var Handled: Boolean; var VMethod: string);
begin
   Handled := True;
end;

Making the call:

  FURL := 'https://www.google.com';

  GetUrlToFile(FURL + '/adsense/', 'a.html');

  procedure TfmMain.GetUrlToFile(AURL, AFile : String);
  var
    Output : TMemoryStream;
  begin
    Output := TMemoryStream.Create;
    try
      try
        IdHTTP1.Get(AURL, Output);
        IdHTTP1.Disconnect;
      except
        // Any exception is swallowed here, so the file below is written
        // even if the request fails.
      end;
      Output.SaveToFile(AFile);
    finally
      Output.Free;
    end;
  end;
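
For readers who would rather configure everything in code than in the Object Inspector, the setup above can be sketched as a small console program. This is only an illustration of the same property values, not the answerer's own code: the program name RedirectSketch, the TRedirectLogger class and its LogRedirect method are hypothetical names introduced here, and the sketch assumes an Indy 10 installation with the OpenSSL DLLs available next to the executable.

```pascal
program RedirectSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, IdHTTP, IdSSLOpenSSL, IdCookieManager;

type
  // Hypothetical helper: OnRedirect must be a method of an object,
  // so a tiny class stands in for the form used in the answer.
  TRedirectLogger = class
  public
    procedure LogRedirect(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: Boolean;
      var VMethod: TIdHTTPMethod);
  end;

procedure TRedirectLogger.LogRedirect(Sender: TObject; var dest: string;
  var NumRedirect: Integer; var Handled: Boolean;
  var VMethod: TIdHTTPMethod);
begin
  // Print each hop so it is visible where the chain stops.
  Writeln('Redirect ', NumRedirect, ' -> ', dest);
  Handled := True; // tell TIdHTTP to follow the redirect
end;

procedure GetUrlToFile(const AURL, AFile: string);
var
  Http: TIdHTTP;
  Ssl: TIdSSLIOHandlerSocketOpenSSL;
  Logger: TRedirectLogger;
  Output: TMemoryStream;
begin
  Logger := TRedirectLogger.Create;
  Http := TIdHTTP.Create(nil);
  try
    // Owned by Http, so both are freed together with it.
    Ssl := TIdSSLIOHandlerSocketOpenSSL.Create(Http);
    Ssl.SSLOptions.Method := sslvSSLv23;
    Http.IOHandler := Ssl;
    Http.CookieManager := TIdCookieManager.Create(Http);

    // Same values as listed in the answer.
    Http.AllowCookies := True;
    Http.HandleRedirects := True;
    Http.RedirectMaximum := 35;
    Http.Request.UserAgent :=
      'Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8';
    Http.OnRedirect := Logger.LogRedirect;

    Output := TMemoryStream.Create;
    try
      Http.Get(AURL, Output);
      Output.SaveToFile(AFile);
      Writeln('Final status code: ', Http.ResponseCode);
    finally
      Output.Free;
    end;
  finally
    Http.Free;
    Logger.Free;
  end;
end;

begin
  try
    GetUrlToFile('https://www.google.com/adsense/', 'a.html');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

Unlike the answer's version, this sketch reports exceptions instead of swallowing them, which may make it easier to see why a request chain ends early.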

Here's the (request and response headers) output from Fiddler:
