Why is my UrlFetchApp function failing to log in successfully?


Question

I'm trying to use Google Apps Script to log in to an ASP.NET website and scrape some data that I usually have to retrieve manually. I used Chrome Developer Tools to get the correct payload names (TEXT_Username, TEXT_Password, __VIEWSTATE, __VIEWSTATEGENERATOR), and I also obtained an ASP.NET session ID to send along with my POST request.

When I run my functions, I get Response Code = 200 if followRedirects is set to false and Response Code = 302 if followRedirects is set to true. Unfortunately, in neither case do the functions successfully authenticate with the website. Instead, the HTML returned is that of the login page.

I've tried different header variants and parameters, but I can't seem to log in successfully.

One other point: when I log in through Chrome with the Developer Tools open, the response code for the POST is 302 Found.

Does anyone have suggestions on how I can log in to this site successfully? Do you see any errors in my functions that could be causing the problem? I'm open to any and all suggestions.

My GAS functions follow:

    function login(cookie, viewState, viewStateGenerator) {
      var payload = {
        "__VIEWSTATE": viewState,
        "__VIEWSTATEGENERATOR": viewStateGenerator,
        "TEXT_Username": "myUserName",
        "TEXT_Password": "myPassword"
      };
      var header = {'Cookie': cookie};
      Logger.log(header);
      var options = {
        "method": "post",
        "payload": payload,
        "followRedirects": false,
        "headers": header
      };
      var browser = UrlFetchApp.fetch("http://tnetwork.trakus.com/tnet/Login.aspx?", options);
      Utilities.sleep(1000);
      var html = browser.getContentText();
      var response = browser.getResponseCode();
      var cookie2 = browser.getAllHeaders()['Set-Cookie'];
      Logger.log(response);
      Logger.log(html);
    }

    function loginPage() {
      var options = {
        "method": "get",
        "followRedirects": false
      };
      var browser = UrlFetchApp.fetch("http://tnetwork.trakus.com/tnet/Login.aspx?", options);
      var html = browser.getContentText();
      // Utilities.sleep(500);
      var response = browser.getResponseCode();
      var cookie = browser.getAllHeaders()['Set-Cookie'];
      login(cookie);
      var regExpGen = new RegExp("<input type=\"hidden\" name=\"__VIEWSTATEGENERATOR\" id=\"__VIEWSTATEGENERATOR\" value=\"(.*)\" \/>");
      var viewStateGenerator = regExpGen.exec(html)[1];
      var regExpView = new RegExp("<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"(.*)\" \/>");
      var viewState = regExpView.exec(html)[1];
      var response = login(cookie, viewState, viewStateGenerator);
      return response;
    }

I call the script by running the loginPage() function. This function obtains the cookie (session ID), then calls the login function and passes the session ID (cookie) along.
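As a point of comparison for the hidden-field extraction in loginPage(), here is a minimal sketch that reads the ASP.NET hidden inputs with a pattern that cannot run past the closing quote. The helper names extractHiddenField and buildLoginPayload are hypothetical, and the sketch assumes the name attribute appears before value inside the tag, as in the markup the regular expressions above target. The BUTTON_Submit field is taken from the Chrome capture.

```javascript
// Returns the value attribute of the <input> with the given name,
// or null when no such field is found in the HTML.
function extractHiddenField(html, name) {
  // Escape regex metacharacters in the field name.
  var safeName = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // [^>]* stays inside one tag; [^"]* stops at the closing quote.
  var pattern = new RegExp('<input[^>]*name="' + safeName + '"[^>]*value="([^"]*)"');
  var match = pattern.exec(html);
  return match ? match[1] : null;
}

// Hypothetical helper: assembles the form fields for the login POST.
function buildLoginPayload(html, username, password) {
  return {
    '__VIEWSTATE': extractHiddenField(html, '__VIEWSTATE'),
    '__VIEWSTATEGENERATOR': extractHiddenField(html, '__VIEWSTATEGENERATOR'),
    'TEXT_Username': username,
    'TEXT_Password': password,
    'BUTTON_Submit': 'Log In'
  };
}
```

Returning null instead of throwing lets the caller decide what to do when the login page markup changes.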

Here is what I see in the Network tab of Chrome Developer Tools when I log in using the Chrome browser:

    Remote Address: 66.92.89.141:80
    Request URL: http://tnetwork.trakus.com/tnet/Login.aspx
    Request Method: POST
    Status Code:302 Found

    **Request Headers**
    Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding:gzip, deflate
    Accept-Language: en-US,en;q=0.8
    Cache-Control:max-age=0
    Connection:keep-alive
    Content-Length: 252
    Content-Type:application/x-www-form-urlencoded
    Cookie: ASP.NET_SessionId=jayaejut5hopr43xkp0vhzu4; userCredentials=username=myUsername; .ASPXAUTH=A54B65A54A850901437E07D8C6856B7799CAF84C1880EEC530074509ADCF40456FE04EC9A4E47D1D359C1645006B29C8A0A7D2198AA1E225C636E7DC24C9DA46072DE003EFC24B9FF2941755F2F290DC1037BB2B289241A0E30AF5CB736E6E1A7AF52630D8B31318A36A4017893452B29216DCF2; __utma=260442568.1595796669.1421539534.1425211879.1425214489.16; __utmc=260442568; __utmz=260442568.1421539534.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=190106350.1735963725.1421539540.1425152706.1425212185.18; __utmc=190106350; __utmz=190106350.1421539540.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
    Host:tnetwork.trakus.com
    Origin:http://tnetwork.trakus.com
    Referer:http://tnetwork.trakus.com/tnet/Login.aspx?
    User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36

    **Form Data**
    __VIEWSTATE: O7YCnq5e471jHLqfPre/YW+dxYxyhoQ/VetOBeA1hqMubTAAUfn+j9HDyVeEgfAdHMl+2DG/9Gw2vAGWYvU97gml5OXiR9E/9ReDaw9EaQg836nBvMMIjE4lVfU=
    __VIEWSTATEGENERATOR:F4425990
    TEXT_Username:myUsername
    TEXT_Password:myPassword
    BUTTON_Submit: Log In

Update: It appears that the website is using an HttpOnly cookie. As a result, I don't think I am capturing the whole cookie, so my header is not correct. I believe I need to set followRedirects to false and handle the redirect and cookie manually. I'm currently researching this process, but I welcome input from anyone who has been down this road.

Recommended answer

I was finally able to log in to the page successfully. The issue seems to be that UrlFetchApp was unable to follow the redirect. I credit this Stack Overflow post: how to fetch a wordpress admin page using google apps script

That post described the following process, which led to my successful login:

  1. Set followRedirects to false.

  2. Submit the POST and capture the cookies.

  3. Use the captured cookies to issue a GET to the appropriate URL.

Here is the relevant code:

    var url = "http://myUrl.com/";
    var options = {
      "method": "post",
      "payload": {
        "TEXT_Username": "myUserName",
        "TEXT_Password": "myPassword",
        "BUTTON_Submit": "Log In"
      },
      "testcookie": 1,
      "followRedirects": false
    };
    var response = UrlFetchApp.fetch(url, options);
    if (response.getResponseCode() == 200) {
      // Incorrect user/pass combo
    } else if (response.getResponseCode() == 302) {
      // Logged-in
      var headers = response.getAllHeaders();
      if (typeof headers['Set-Cookie'] !== 'undefined') {
        // Make sure that we are working with an array of cookies
        var cookies = typeof headers['Set-Cookie'] == 'string' ? [headers['Set-Cookie']] : headers['Set-Cookie'];
        for (var i = 0; i < cookies.length; i++) {
          // We only need the cookie's value - it might have path, expiry time, etc here
          cookies[i] = cookies[i].split(';')[0];
        }

        url = "http://myUrl/Calendar.aspx";
        options = {
          "method": "get",
          // Set the cookies so that we appear logged-in
          "headers": {
            "Cookie": cookies.join(';')
          }
        };
        ...
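The cookie handling above could be factored into a small reusable helper. The following is only a sketch under stated assumptions: toCookieHeader and loginAndFetch are hypothetical names, the URLs and payload are supplied by the caller, and it assumes (as the code above does) that the cookies set by the login response are enough for the follow-up request. A site that also requires the session cookie from the initial GET would need that cookie merged in as well.

```javascript
// Turns the Set-Cookie value(s) from getAllHeaders() into a single
// Cookie request-header string, keeping only each "name=value" pair.
function toCookieHeader(setCookie) {
  if (!setCookie) {
    return '';
  }
  // getAllHeaders() gives a string for one cookie and an array for several.
  var list = typeof setCookie === 'string' ? [setCookie] : setCookie;
  return list
    .map(function (cookie) {
      // Drop attributes such as path, expires and HttpOnly.
      return cookie.split(';')[0].trim();
    })
    .join('; ');
}

// Hypothetical wrapper around the three steps listed in the answer.
function loginAndFetch(loginUrl, targetUrl, payload) {
  // Steps 1 and 2: POST without following the redirect, then read the cookies.
  var loginResponse = UrlFetchApp.fetch(loginUrl, {
    method: 'post',
    payload: payload,
    followRedirects: false
  });
  if (loginResponse.getResponseCode() !== 302) {
    throw new Error('Login did not redirect; got ' + loginResponse.getResponseCode());
  }
  var cookieHeader = toCookieHeader(loginResponse.getAllHeaders()['Set-Cookie']);
  // Step 3: GET the protected page with the captured cookies.
  return UrlFetchApp.fetch(targetUrl, {
    method: 'get',
    headers: { Cookie: cookieHeader }
  }).getContentText();
}
```

Only toCookieHeader is pure JavaScript; loginAndFetch relies on UrlFetchApp and therefore runs only inside Google Apps Script.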
