Python和机械化登录脚本 [英] Python and mechanize login script

查看:89
本文介绍了Python和机械化登录脚本的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

程序员们!

我正在尝试编写脚本以使用python和mechanize模块登录我的大学的食物平衡"页面.

I am trying to write a script to login into my universities "food balance" page using python and the mechanize module...

这是我要登录的页面: http://www.wcu.edu/11407.asp 该网站具有以下登录格式:

This is the page I am trying to log into: http://www.wcu.edu/11407.asp The website has the following form to login:

<FORM method=post action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx><INPUT value=https://cf.wcu.edu/busafrs/catcard/idsearch.cfm type=hidden name=wcuirs_uri> 
<P><B>WCU ID Number<BR></B><INPUT maxLength=12 size=12 type=password name=id> </P>
<P><B>PIN<BR></B><INPUT maxLength=20 type=password name=PIN> </P>
<P></P>
<P><INPUT value="Request Access" type=submit name=submit> </P></FORM>

由此我们知道我需要填写以下字段: 1.名称= id 2.名称= PIN

From this we know that I need to fill in the following fields: 1. name=id 2. name=PIN

通过操作:action = https://itapp.wcu.edu/BanAuthRedirector/Default.aspx

With the action: action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx

这是我到目前为止编写的脚本:

This is the script I have written thus far:

#!/usr/bin/python2 -W ignore

import mechanize, cookielib
from time import sleep

url   = 'http://www.wcu.edu/11407.asp'
myId  = '11111111111'
myPin = '22222222222'

# Browser
#br = mechanize.Browser()
#br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))
br = mechanize.Browser(factory=mechanize.RobustFactory()) # Use this because of bad html tags in the html...

# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)

# Follows refresh 0 but not hangs on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

# User-Agent (fake agent to google-chrome linux x86_64)
br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'),
                 ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
                 ('Accept-Encoding', 'gzip,deflate,sdch'),                  
                 ('Accept-Language', 'en-US,en;q=0.8'),                     
                 ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')]

# The site we will navigate into
br.open(url)

# Go though all the forms (for debugging only)
for f in br.forms():
    print f


# Select the first (index two) form
br.select_form(nr=2)

# User credentials
br.form['id']  = myId
br.form['PIN'] = myPin

br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'

# Login
br.submit()

# Wait 10 seconds
sleep(10)

# Save to a file
f = file('mycatpage.html', 'w')
f.write(br.response().read())
f.close()

现在是问题...

出于某种奇怪的原因,我返回的页面(在mycatpage.html中)是登录页面,而不是显示我的猫现金余额"和零食数量"的预期页面...

For some odd reason the page I get back (in mycatpage.html) is the login page and not the expected page that displays my "cat cash balance" and "number of block meals" left...

有人知道为什么吗?请记住,头文件一切都正确,而id和pass并不是真正的111111111和222222222,但正确的值确实适用于网站(使用浏览器...)

Does anyone have any idea why? Keep in mind that everything is correct with the header files and while the id and pass are not really 111111111 and 222222222, the correct values do work with the website (using a browser...)

预先感谢

编辑

我尝试过的另一个脚本:

Another script I tried:

from urllib import urlopen, urlencode                                           
import urllib2                                                                  
import httplib                                                                  

url = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'                    

myId = 'xxxxxxxx'                                                               
myPin = 'xxxxxxxx'                                                              

data = {                                                                        
            'id':myId,                                                          
            'PIN':myPin,                                                        
            'submit':'Request Access',                                          
            'wcuirs_uri':'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'      
        }                                                                       

opener = urllib2.build_opener()                                                 
opener.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'),
                     ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
                     ('Accept-Encoding', 'gzip,deflate,sdch'),                  
                     ('Accept-Language', 'en-US,en;q=0.8'),                     
                     ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')]      

request = urllib2.Request(url, urlencode(data))                                 
open("mycatpage.html", 'w').write(opener.open(request))

这具有相同的行为...

This has the same behavior...

推荐答案

# User credentials
br.form['id']  = myId
br.form['PIN'] = myPin

我相信这是问题所在.

尝试将其更改为

br['id'] = myId
br['PIN'] = myPin

我也很确定您不需要br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx',因为您已经选择了表单,因此只需调用Submit即可,但是我可能是错的.

I'm also pretty sure that you don't need br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx' because you have already selected the form so just calling submit should work, but I could be wrong.

此外,我仅使用urllib和urllib2做过类似的任务,因此,如果这不起作用,我将发布该代码.

Additionally, I have done a similar task just using urllib and urllib2, so if this doesn't work I will post that code.

这是我用于urllib和urllib2的技术:

here is the the technique that I used with urllib and urllib2:

import urllib2, urllib

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)
encoded = urllib.urlencode({"PIN":my_pin, "id":my_id})
f = opener.open('http://www.wcu.edu/11407.asp', encoded)
data = f.read()
f.close()

>>> b = mechanize.Browser(factory=mechanize.RobustFactory())
>>> b.open('http://www.wcu.edu/11407.asp')
<response_seek_wrapper at 0x10acfa248 whose wrapped object = <closeable_response at 0x10aca32d8 whose fp = <socket._fileobject object at 0x10aaf45d0>>>
>>> b.select_form(nr=2)
>>> b.form
<mechanize._form.HTMLForm instance at 0x10ad0dbd8>
>>> b.form.attrs
{'action': 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx', 'method': 'post'}

这可能是您的问题?不确定.

This could be your problem? Not sure.

修改3:

使用html检查器,我认为您很有可能需要将'wcuirs_uir'设置为'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'.我有95%的肯定能行得通.

Used an html inspector, I think there's a decent chance you need to set 'wcuirs_uir' to 'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'. I'm 95% sure that will work.

这篇关于Python和机械化登录脚本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆