Mechanize not working for automating gmail login in Google Appengine

Problem description

I have used mechanize in an app deployed on GAE, and it works fine. For an app I am making now, however, I am trying to automate logging in to Gmail through mechanize. It works neither in the development environment on my local machine nor after deploying to App Engine.

The same script does work on my own server, where it runs through mod_python using PSP.

I found a lot of solutions here, but none of them works for me. Here is a snippet of my code:

<snip>
br = mechanize.Browser()
response = br.open("http://www.gmail.com")
loginForm = br.forms().next()
loginForm["Email"] = self.request.get('user')
loginForm["Passwd"] = self.request.get('password')
response = br.open(loginForm.click())
response2 = br.open("http://mail.google.com/mail/h/")
result = response2.read()
<snip>

On App Engine, the result is just the login page. With mod_python hosted on my own server, I get the page with the user's inbox.

Solution

The problem is most likely due to how Google crippled the urllib2 module on GAE.

Internally, urllib2 on GAE now uses the urlfetch module (which Google wrote), and the HTTPCookieProcessor() functionality has been removed completely. As a result, cookies are NOT persisted from request to request, and that persistence is the critical piece when logging in to sites programmatically.
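To make "persisted from request to request" concrete, here is a minimal sketch of what a cookie processor has to do: remember every Set-Cookie value it sees and send them all back in a Cookie header on the next request. The cookie names and values are made up for illustration, and cookie_header is a hypothetical helper, not part of urlfetch or mechanize.

```python
try:
    from http.cookies import SimpleCookie   # Python 3
except ImportError:
    from Cookie import SimpleCookie         # Python 2, as used in the answer's code


def cookie_header(jar):
    """Serialize every cookie in the jar into one Cookie request header value."""
    return "; ".join("%s=%s" % (m.key, m.value) for m in jar.values())


jar = SimpleCookie()

# The first response sets a session cookie (hypothetical name and value).
jar.load("SID=abc123; Path=/")
# A later response sets a second cookie; the jar keeps both.
jar.load("LSID=xyz789; Path=/accounts")

# This is the value that has to accompany the next request.
header = cookie_header(jar)
print(header)
```

Without something playing the role of jar between requests, the second request arrives with no cookies at all, and the site responds as if nobody had logged in.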

There is a way around this, but not with mechanize. You have to roll your own cookie processor. Here is the basic approach I took (not perfect, but it gets the job done):

import urllib, urllib2, Cookie
from google.appengine.api import urlfetch
from urlparse import urljoin
import logging

class GAEOpener(object):
    def __init__(self):
        self.cookie = Cookie.SimpleCookie()
        self.last_response = None

    def open(self, url, data = None):
        base_url = url
        if data is None:
            method = urlfetch.GET
        else:
            method = urlfetch.POST
        while url is not None:
            self.last_response = urlfetch.fetch(url = url,
                payload = data,
                method = method,
                headers = self._get_headers(self.cookie),
                allow_truncated = False,
                follow_redirects = False,
                deadline = 10
                )
            data = None # Next request will be a get, so no need to send the data again. 
            method = urlfetch.GET
            self.cookie.load(self.last_response.headers.get('set-cookie', '')) # Load the cookies from the response
            url = urljoin(base_url, self.last_response.headers.get('location'))
            if url == base_url:
                url = None
        return self.last_response

    def _get_headers(self, cookie):
        headers = {
            'Host' : '<ENTER HOST NAME HERE>',
            'User-Agent' : 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)',
            'Cookie' : self._make_cookie_header(cookie)
             }
        return headers

    def _make_cookie_header(self, cookie):
        cookie_header = ""
        for value in cookie.values():
            cookie_header += "%s=%s; " % (value.key, value.value)
        return cookie_header

    def get_cookie_header(self):
        return self._make_cookie_header(self.cookie)

You can use it the way you would use urllib2.urlopen, except that you create a GAEOpener instance and call its "open" method.
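The reason the class above turns off follow_redirects and loops by hand is that a login flow often sets cookies on intermediate redirect responses. The sketch below shows that pattern outside App Engine. It is an illustration only: FAKE_SITE and fake_fetch are hypothetical stand-ins for urlfetch.fetch, the URLs and cookie names are invented, and the max_hops limit is an addition that the original code does not have.

```python
try:
    from http.cookies import SimpleCookie   # Python 3
    from urllib.parse import urljoin
except ImportError:
    from Cookie import SimpleCookie         # Python 2
    from urlparse import urljoin

# Hypothetical canned responses: url -> (status code, response headers).
FAKE_SITE = {
    "http://example.test/login": (302, {"set-cookie": "session=s1; Path=/",
                                        "location": "/check"}),
    "http://example.test/check": (302, {"set-cookie": "auth=ok; Path=/",
                                        "location": "/inbox"}),
    "http://example.test/inbox": (200, {}),
}


def fake_fetch(url, headers):
    """Stand-in for one HTTP request that does not follow redirects."""
    return FAKE_SITE[url]


def open_with_cookies(url, jar, max_hops=5):
    """Follow redirects by hand, loading the cookies from every hop into jar."""
    visited = []
    for _ in range(max_hops):
        sent = "; ".join("%s=%s" % (m.key, m.value) for m in jar.values())
        status, headers = fake_fetch(url, {"Cookie": sent})
        visited.append(url)
        jar.load(headers.get("set-cookie", ""))
        location = headers.get("location")
        if not location:
            return status, visited           # no redirect: final response
        url = urljoin(url, location)         # Location may be relative
    raise RuntimeError("too many redirects")


jar = SimpleCookie()
status, visited = open_with_cookies("http://example.test/login", jar)
print(status, visited, sorted(jar.keys()))
```

If the fetch layer followed the redirects itself, only the headers of the final response would be visible here, and the cookies set by the two intermediate hops would be lost.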
