Using mechanize to log in to a webpage

Problem description

This is my first experience programming in Python, and I'm trying to log in to this webpage. After searching around, I found that many people suggested using mechanize. To be sure I had set things up correctly before writing any code, I downloaded the mechanize zip from the website and put my Python script in the unzipped mechanize folder.

I have this code so far, pieced together from different examples I've found:

import mechanize

theurl = 'http://voyager.umeres.maine.edu/Login'
mech = mechanize.Browser()
mech.open(theurl)
mech.select_form(nr=0)
mech["userid"] = "MYUSERNAME"
mech["password"] = "MYPASSWORD"
results = mech.submit().read()
# write the response body to disk; binary mode works on Python 2 and 3
with open('test.html', 'wb') as f:
    f.write(results)

From looking at the source of the webpage, I believe userid and password are the correct names of the form fields. When I run the script in IDLE, I get a bunch of errors, including a timeout error and a robot error.

I'm also not sure what to expect even if the code works. The login is for my school email, which has class folders as well. My end goal is, once I'm logged in, to parse some of the folders for information and store it in a file that can later be converted to JSON or an RSS feed. That is much further down the road, once I understand Python better; I'm just trying to give a clearer idea of what I want to accomplish.

Solution

The problem is that mechanize respects robots.txt. You must turn that off:

mech = mechanize.Browser()
# needs to be set before you call open
mech.set_handle_robots(False)

Edit: it appears that the site is using some sort of additional POST values that are generated via JavaScript. These may be a pain to recreate yourself; check the source of the page to see what's going on.

Actual POST values being sent:

challenge [a14b1f67-11edcc01]
charset UTF-8
login Login
origurl /Login/
password
savedpw 0
sha1 3f77d1e8c2ab0470ef8005a85f5f9c0d7aeedba6
userid sdsads
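
One way to act on the advice to check the page source is to list every input field the login form declares, hidden ones included, and compare them with the POST values above. The following is only a sketch using the Python 3 standard library. SAMPLE_HTML is a hypothetical stand-in for the real login page: the field names come from the POST values above, but the markup itself is invented. InputCollector and list_form_fields are illustrative names, not part of mechanize.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the name and initial value of every <input> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # inputs without a value attribute are recorded as empty strings
            self.fields[name] = attributes.get("value") or ""


def list_form_fields(html):
    """Return a dict mapping input names to their initial values."""
    collector = InputCollector()
    collector.feed(html)
    return collector.fields


# Hypothetical markup; the real page source may look quite different.
SAMPLE_HTML = """
<form method="post" action="/Login/">
  <input type="hidden" name="challenge" value="[a14b1f67-11edcc01]">
  <input type="hidden" name="sha1" value="">
  <input type="text" name="userid">
  <input type="password" name="password">
  <input type="submit" name="login" value="Login">
</form>
"""

fields = list_form_fields(SAMPLE_HTML)
for name, value in fields.items():
    print(name, repr(value))
```

Fields that are empty in the page source but filled in the captured POST (sha1 in this hypothetical markup) are the ones most likely to be computed by JavaScript before the form is submitted.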
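
To illustrate what respecting robots.txt means for the robot error in the question, the sketch below evaluates robots.txt rules with the standard library's urllib.robotparser (Python 3). This is not mechanize's own code, only the same kind of check. The rules in BLOCK_ALL and ALLOW_ALL are made up for the example, since the site's actual robots.txt is not shown in this thread, and is_allowed is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, user_agent="*"):
    """Return True if the given robots.txt lines permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


LOGIN_URL = "http://voyager.umeres.maine.edu/Login"

# Hypothetical rule sets, not the site's real robots.txt.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]
ALLOW_ALL = ["User-agent: *", "Disallow:"]

print(is_allowed(BLOCK_ALL, LOGIN_URL))
print(is_allowed(ALLOW_ALL, LOGIN_URL))
```

A client that honors rules like BLOCK_ALL refuses to fetch the login page at all, which is why the answer disables that check with set_handle_robots(False) before calling open.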