Not getting all cookie info using the Python requests module


Problem description

I'm learning how to log in to an example website using the Python requests module. A video tutorial got me started. Of all the cookies that I see in Google Chrome > Inspect Element > Network tab, I can retrieve only one using the following code:

import requests
with requests.Session() as s:
    url = 'http://www.noobmovies.com/accounts/login/?next=/'
    s.get(url)
    allcookies = s.cookies.get_dict()
    print allcookies

With this I get only csrftoken, as shown below:

{'csrftoken': 'ePE8zGxV4yHJ5j1NoGbXnhLK1FQ4jwqO'}

But in Google Chrome I see several other cookies besides csrftoken (sessionid, _gat, _ga, etc.).

I even tried the following urllib2-based code, taken from another source, but the result was the same:

from urllib2 import Request, build_opener, HTTPCookieProcessor, HTTPHandler
import cookielib

#Create a CookieJar object to hold the cookies
cj = cookielib.CookieJar()
#Create an opener to open pages using the http protocol and to process cookies.
opener = build_opener(HTTPCookieProcessor(cj), HTTPHandler())

#create a request object to be used to get the page.
req = Request("http://www.noobmovies.com/accounts/login/?next=/")
f = opener.open(req)

#see the first few lines of the page
html = f.read()
print html[:50]

#Check out the cookies
print "the cookies are: "
for cookie in cj:
    print cookie

Output:

<!DOCTYPE html>
<html xmlns="http://www.w3.org
the cookies are: 
<Cookie csrftoken=ePE8zGxV4yHJ5j1NoGbXnhLK1FQ4jwqO for www.noobmovies.com/>

So, how can I get all the cookies? Thanks.

Solution

The other cookies are being set by other pages or resources, probably loaded by JavaScript code. You can check this by requesting only the page itself (without running the JS code), using tools such as wget, curl or httpie.
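The same check can be sketched in Python 3 with only the standard library (the code in the question is Python 2). This is a minimal sketch, not a definitive implementation: `extract_set_cookie` and `fetch_set_cookie_headers` are hypothetical helper names, the fetch needs network access, and because `urlopen` follows redirects, only the headers of the final response are inspected.

```python
from urllib.request import urlopen


def extract_set_cookie(headers):
    """Return every Set-Cookie value found in a response header object."""
    # urllib's response.headers is an email.message.Message subclass, so
    # get_all() returns each repeated header, or None when there are none.
    return headers.get_all("Set-Cookie") or []


def fetch_set_cookie_headers(url):
    """Fetch `url` (no JavaScript is executed) and list its Set-Cookie headers."""
    with urlopen(url) as response:
        return extract_set_cookie(response.headers)
```

Calling `fetch_set_cookie_headers("http://www.noobmovies.com/accounts/login/?next=/")` should list only the cookies the server itself sends for that page. Any cookie that shows up in the browser but not in this list was set somewhere else.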

The only cookie this server sets is csrftoken, as you can see here:

$ wget --server-response 'http://www.noobmovies.com/accounts/login/?next=/'
--2016-02-01 22:51:55--  http://www.noobmovies.com/accounts/login/?next=/
Resolving www.noobmovies.com (www.noobmovies.com)... 69.164.217.90
Connecting to www.noobmovies.com (www.noobmovies.com)|69.164.217.90|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Server: nginx/1.4.6 (Ubuntu)
  Date: Tue, 02 Feb 2016 00:51:58 GMT
  Content-Type: text/html; charset=utf-8
  Transfer-Encoding: chunked
  Connection: keep-alive
  Vary: Accept-Encoding
  Expires: Tue, 02 Feb 2016 00:51:58 GMT
  Vary: Cookie,Accept-Encoding
  Cache-Control: max-age=0
  Set-Cookie: csrftoken=XJ07sWhMpT1hqv4K96lXkyDWAYIFt1W5; expires=Tue, 31-Jan-2017 00:51:58 GMT; Max-Age=31449600; Path=/
  Last-Modified: Tue, 02 Feb 2016 00:51:58 GMT
Length: unspecified [text/html]
Saving to: ‘index.html?next=%2F’

index.html?next=%2F             [      <=>                                   ]  10,83K  2,93KB/s    in 3,7s    

2016-02-01 22:52:03 (2,93 KB/s) - ‘index.html?next=%2F’ saved [11085]

Note the Set-Cookie line.
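To make the comparison concrete, the sketch below parses the Set-Cookie value from the wget output and subtracts it from the cookie names the question reports seeing in Chrome. It is written for Python 3 and uses only the standard library. `server_cookie_names` is a hypothetical helper, and the interpretation of the leftovers is an assumption: `_ga` and `_gat` are names typically used by Google Analytics scripts, and `sessionid` is presumably issued only after a successful login.

```python
from http.cookies import SimpleCookie

# Set-Cookie value copied verbatim from the wget output above.
SET_COOKIE = (
    "csrftoken=XJ07sWhMpT1hqv4K96lXkyDWAYIFt1W5; "
    "expires=Tue, 31-Jan-2017 00:51:58 GMT; Max-Age=31449600; Path=/"
)

# Cookie names the question reports seeing in Chrome.
BROWSER_COOKIES = {"csrftoken", "sessionid", "_gat", "_ga"}


def server_cookie_names(set_cookie_values):
    """Collect the cookie names declared by a list of Set-Cookie values."""
    names = set()
    for value in set_cookie_values:
        parsed = SimpleCookie()
        parsed.load(value)  # attributes such as Path and Max-Age are not names
        names.update(parsed.keys())
    return names


server_side = server_cookie_names([SET_COOKIE])
set_elsewhere = BROWSER_COOKIES - server_side

print(sorted(server_side))    # ['csrftoken']
print(sorted(set_elsewhere))  # ['_ga', '_gat', 'sessionid']
```

The second list contains the cookies that a plain HTTP client such as requests or urllib2 will never receive from this single GET request, because nothing in the response sets them.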
