如何使机械化不是失败,并在此页面上的形式? [英] How to make mechanize not fail with forms on this page?
问题描述
进口机械化URL ='http://steamcommunity.comBR = mechanize.Browser(工厂= mechanize.RobustFactory())br.open(URL)
打印br.request
打印br.form
对于每个在br.forms():
每个打印
打印
以上code的结果:
回溯(最后最近一次调用):
文件./mech_test.py,12号线,上述<&模块GT;
对于每个在br.forms():
文件建立/ bdist.linux-的i686 /蛋/机械化/ _mechanize.py,线路426,在形式
文件建立/ bdist.linux-的i686 /蛋/机械化/ _html.py,线路559,在形式
文件建立/ bdist.linux-的i686 /蛋/机械化/ _html.py,线路228,在形式
mechanize._html.ParseError
我的具体目标是使用登录表单,但我甚至不能机械化认识到存在任何形式的。即使使用我认为是选择的最基本的方法的任何的形式, br.select_form(NR = 0)
,结果在相同的回溯。表单的ENCTYPE是的multipart / form-data的,如果有差别。
我想这一切都归结到一个问题的两个部分:我怎样才能用机械化此网页的工作,或者如果它是不可能的,什么是同时保持饼干的另一种方式。
?编辑:如下面所提到的,这个重定向到'https://steamcommunity.com'
机械化可以成功地检索HTML作为可与以下code可以看出:
URL ='https://steamcommunity.comHH = mechanize.HTTPSHandler()#你可能想HTTPSHandler,太
hh.set_http_debuglevel(1)
首战= mechanize.build_opener(HH)
响应= opener.open(URL)
内容= response.readlines()打印内容
使用这个秘密,我敢肯定,这是对你的工作;)
BR = mechanize.Browser(工厂= mechanize.DefaultFactory(i_want_broken_xhtml_support = TRUE))
import mechanize
url = 'http://steamcommunity.com'
br=mechanize.Browser(factory=mechanize.RobustFactory())
br.open(url)
print br.request
print br.form
for each in br.forms():
print each
print
The above code results in:
Traceback (most recent call last):
File "./mech_test.py", line 12, in <module>
for each in br.forms():
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 426, in forms
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 559, in forms
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 228, in forms
mechanize._html.ParseError
My specific goal is to use the login form, but I can't even get mechanize to recognize that there are any forms. Even using what I think is the most basic method of selecting any form, br.select_form(nr=0)
, results in the same traceback. The form's enctype is multipart/form-data if that makes a difference.
I guess that all boils down to a two part question: How can I get mechanize to work with this page, or if it's not possible, what's another way while maintaining cookies?
edit: As mentioned below, this redirects to 'https://steamcommunity.com'.
Mechanize can successfully retrieving the HTML as can be seen with the following code:
url = 'https://steamcommunity.com'
hh = mechanize.HTTPSHandler() # you might want HTTPSHandler, too
hh.set_http_debuglevel(1)
opener = mechanize.build_opener(hh)
response = opener.open(url)
contents = response.readlines()
print contents
Use this secret, i'm sure this is work for you ;)
br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))
这篇关于如何使机械化不是失败,并在此页面上的形式?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!