Get restricted page after login using requests, urllib2 (Python)

Question

I'm trying to log in to this page using python-requests:

import requests

headers = {
    'content-type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/33.0.1750.152 Chrome/33.0.1750.152 Safari/537.36'
}

# url, myusername and mypassword are placeholders for the real values
data = {
    'username': myusername,
    'password': mypassword,
}
r = requests.post(url, data=data, headers=headers)

I printed the returned response with print r and the output was <Response [200]>, but the HTML was that of the login page. I was expecting the HTML of the page we are redirected to after login.

Recommended answer

The login form contains several hidden fields:

<input type="hidden" name="lt" value="LT-1314930-GPfgUfyUj5eRY4RCaoa1Xi3gi5Jfsf" />
<input type="hidden" name="execution" value="e3s1" />
<input type="hidden" name="_eventId" value="submit" /> 

Most likely the first field, and perhaps the second, is auto-generated and tied to the session. You'll need to load the login page first (using a session), parse those fields, and include them in your POST.
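The load, parse, and post steps could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names hidden_fields and login are made up for this example, the form is assumed to post back to the same URL it was loaded from (the real form's action attribute may differ), and the standard-library HTMLParser is used instead of BeautifulSoup so the sketch has no extra dependencies. The session argument is expected to be a requests.Session.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<input ... />) are routed here as well.
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def login(session, login_url, username, password):
    """Load the login page, then post the credentials plus hidden fields.

    Assumption: the form submits to login_url itself.
    """
    page = session.get(login_url)           # sets the session cookies
    payload = hidden_fields(page.text)      # e.g. lt, execution, _eventId
    payload.update({"username": username, "password": password})
    return session.post(login_url, data=payload)
```

With requests installed, this would be called as r = login(requests.Session(), url, myusername, mypassword). Reusing the same session for later requests keeps the login cookies.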

The reason you get a 200 response is that the site redirects unauthorized requests back to the login page; check r.history, which will contain one or more 302 responses.
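One way to inspect that redirect chain is sketched below. It relies only on the history, status_code and url attributes of a requests response; the helper names redirect_chain and bounced_to_login are invented for this example, and comparing the final URL against the login URL with startswith is a rough heuristic, not something the site guarantees.

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]


def bounced_to_login(response, login_url):
    """Heuristic: True if we were redirected and ended up on the login page."""
    return bool(response.history) and response.url.startswith(login_url)
```

If bounced_to_login(r, url) is true for the response to the POST, the login did not succeed, even though r.status_code is 200.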

You could use BeautifulSoup to parse this, or use robobrowser, which combines requests and BeautifulSoup with a dedicated form handler to make a browser-like framework for navigating a website:

from robobrowser import RoboBrowser

browser = RoboBrowser(history=True,
    user_agent='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/33.0.1750.152 Chrome/33.0.1750.152 Safari/537.36')
browser.open('http://selleraccounts.snapdeal.com/')

form = browser.get_form(id='fm1')
form['username'].value = myusername
form['password'].value = mypassword
browser.submit_form(form)
