How does Python's Requests treat multiple cookies in a header
Question
I use Python Requests to extract the full headers of responses.
I want to accurately count how many cookies (i.e. name/value pairs) are in a response. There are two issues:
1) If a server responds with multiple Set-Cookie headers, how does Requests represent this? Does it combine the Set-Cookie values into one, or leave them as is?
Here is my script to print the full headers:
import requests
requests.packages.urllib3.disable_warnings() # to disable certificate warnings
response = requests.get("https://example.com",verify=False,timeout=3)
print(str(response.headers))
response_headers = response.headers.get('Set-Cookie')
But when I look at some Set-Cookie response headers, I find that some name/value pairs are separated by commas, like this:
dnn_IsMobile=False; path=/; secure; HttpOnly, Analytics_VisitorId=aa; expires=Mon 19-Aug-2019 14:20:02 GMT; path=/; secure; HttpOnly, Analytics=SessionId=vv&ContentItemId=-1; expires=Sat 20-Jul-2019 15:20:02 GMT; path=/; secure
2) Does this mean the server sent multiple Set-Cookie headers and Requests combined them?
If Requests adds the comma between the cookies' name/value pairs, does it always separate them with a comma followed by a space, i.e. cookie1=value, cookie2=value, and not just a comma, like cookie1=value,cookie2=value?
Understanding this difference is very important to me so that I can count the right number of cookies received.
Recommended answer
How to count the cookies and fetch them
You can use the higher-level .cookies attribute to get them, instead of using .headers.
For example:
>>> import requests
>>> url = "https://github.com"
>>> r = requests.get(url)
>>> r.cookies
<RequestsCookieJar[Cookie(version=0, name='_octo', value='GH1.1.1081626831.1563694143', port=None, port_specified=False, domain='.github.com', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=1626852543, discard=False, comment=None, comment_url=None, rest={}, rfc2109=False), Cookie(version=0, name='logged_in', value='no', port=None, port_specified=False, domain='.github.com', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=True, expires=2194846143, discard=False, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False), Cookie(version=0, name='_gh_sess', value='N0NVdFd3dTMzcm9GSkh1U21ZQkVaYWUvWnBnRmVic0VFWm9kSVZKVVhMV0hVdUw4cDh5cGpmTmIrQ0xJYU9tNHE0ZHQxVkZlUU9JRGJHUkJtc21yVGM0Mk9hQjBUYnhDVXJYSFVWSjNzT2ZpNjdEVzF0emZydkJmQmgvZmVRRFhEaE1CRTlnd0ZPY0RRY0Z4L1ByaFFpbWhVTGtPZTZmUHhONzBxclIrWWZSdFlZK09NN1QzS1dlL3cwWmVSdG5wTHFROTh1Zmh6Y3JkMjFDQmtxb2FHQT09LS1DUEd6UHFtWS9ubTdpOEdwYndzU3l3PT0%3D--2f3ae9c74cba34f2e8de6dfe55c3616e8a35ab20', port=None, port_specified=False, domain='github.com', domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=True, expires=None, discard=True, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False), Cookie(version=0, name='has_recent_activity', value='1', port=None, port_specified=False, domain='github.com', domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False, expires=1563697743, discard=False, comment=None, comment_url=None, rest={}, rfc2109=False)]>
>>> len(r.cookies)
4
>>> r.cookies.keys()
['_octo', 'logged_in', '_gh_sess', 'has_recent_activity']
>>> for key in r.cookies.iterkeys(): print("{}: {}".format(key, r.cookies[key]))
...
_octo: GH1.1.1081626831.1563694143
logged_in: no
_gh_sess: N0NVdFd3dTMzcm9GSkh1U21ZQkVaYWUvWnBnRmVic0VFWm9kSVZKVVhMV0hVdUw4cDh5cGpmTmIrQ0xJYU9tNHE0ZHQxVkZlUU9JRGJHUkJtc21yVGM0Mk9hQjBUYnhDVXJYSFVWSjNzT2ZpNjdEVzF0emZydkJmQmgvZmVRRFhEaE1CRTlnd0ZPY0RRY0Z4L1ByaFFpbWhVTGtPZTZmUHhONzBxclIrWWZSdFlZK09NN1QzS1dlL3cwWmVSdG5wTHFROTh1Zmh6Y3JkMjFDQmtxb2FHQT09LS1DUEd6UHFtWS9ubTdpOEdwYndzU3l3PT0%3D--2f3ae9c74cba34f2e8de6dfe55c3616e8a35ab20
has_recent_activity: 1
P.S. Sometimes it's easier to read the source code; I found this by reading cookies.py :)
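As a minimal sketch of the .cookies approach, the helper below counts the cookies parsed from a single response. The name count_cookies and the hand-built Response object are illustrative assumptions, not part of Requests; the stand-in object is only there so the snippet runs without a network call.

```python
import requests


def count_cookies(response):
    """Return (count, names) for the cookies Requests parsed from one response.

    count_cookies is a hypothetical helper name, not a Requests API.
    """
    names = [cookie.name for cookie in response.cookies]
    return len(names), names


# Offline stand-in for a real response (assumption: in practice you would
# pass the object returned by requests.get(...) instead).
fake = requests.Response()
fake.cookies.set("dnn_IsMobile", "False", path="/")
fake.cookies.set("Analytics_VisitorId", "aa", path="/")

count, names = count_cookies(fake)
print(count, sorted(names))
```

As far as I can tell, response.cookies only holds the cookies set by that particular response, so if redirects are involved it may be worth checking the responses in response.history as well.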
Is the separator in r.headers.get("Set-Cookie") ", " or ","?
- Requests uses urllib3 under the hood; you will find that r.raw is an object of urllib3.response.HTTPResponse.
- In urllib3, headers are represented by HTTPHeaderDict, defined in _collections.py, and multiple values are joined by ", " there.
def __getitem__(self, key):
    val = self._container[key.lower()]
    return ", ".join(val[1:])
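So you can split on ", " to count the cookies, with one caveat: an expires date may itself contain ", " (for example "Sun, 21 Jul 2019 08:29:03 -0000"), so a plain split can overcount. That is why .cookies is the more reliable route.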
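The joining behaviour can be reproduced offline. The sketch below builds an HTTPHeaderDict by hand (the header values are shortened from the GitHub example, and importing from the private urllib3._collections module is an assumption that may not hold across urllib3 versions) and compares the joined string with the per-header list from getlist().

```python
from urllib3._collections import HTTPHeaderDict

# Add two separate Set-Cookie headers, as a server would send them.
headers = HTTPHeaderDict()
headers.add(
    "Set-Cookie",
    "has_recent_activity=1; path=/; expires=Sun, 21 Jul 2019 08:29:03 -0000",
)
headers.add(
    "Set-Cookie",
    "logged_in=no; domain=.github.com; path=/; secure; HttpOnly",
)

# Item access joins every value for the key with ", ".
joined = headers["Set-Cookie"]

# getlist() keeps the values as they were added, one per header.
separate = headers.getlist("Set-Cookie")

# A naive split also cuts through the expires date.
naive_parts = joined.split(", ")

print(len(separate))     # number of Set-Cookie headers
print(len(naive_parts))  # number of pieces after the naive split
```

Here getlist() reports two headers while the naive split yields three pieces, because of the comma inside the date. With a real response, r.raw.headers should be the same kind of object, so r.raw.headers.getlist("Set-Cookie") is one way to see the headers unmerged.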
As for whether Requests combines multiple Set-Cookie headers: I'm afraid the answer is yes, as you can see by examining r.headers (some irrelevant headers are removed for readability):
>>> r.headers
{
'Date': 'Sun, 21 Jul 2019 07:29:03 GMT',
'Content-Type': 'text/html; charset=utf-8',
'Transfer-Encoding': 'chunked',
'Server': 'GitHub.com',
'Status': '200 OK',
'Set-Cookie': 'has_recent_activity=1; path=/; expires=Sun, 21 Jul 2019 08:29:03 -0000, _octo=GH1.1.1081626831.1563694143; domain=.github.com; path=/; expires=Wed, 21 Jul 2021 07:29:03 -0000, logged_in=no; domain=.github.com; path=/; expires=Thu, 21 Jul 2039 07:29:03 -0000; secure; HttpOnly, _gh_sess=N0NVdFd3dTMzcm9GSkh1U21ZQkVaYWUvWnBnRmVic0VFWm9kSVZKVVhMV0hVdUw4cDh5cGpmTmIrQ0xJYU9tNHE0ZHQxVkZlUU9JRGJHUkJtc21yVGM0Mk9hQjBUYnhDVXJYSFVWSjNzT2ZpNjdEVzF0emZydkJmQmgvZmVRRFhEaE1CRTlnd0ZPY0RRY0Z4L1ByaFFpbWhVTGtPZTZmUHhONzBxclIrWWZSdFlZK09NN1QzS1dlL3cwWmVSdG5wTHFROTh1Zmh6Y3JkMjFDQmtxb2FHQT09LS1DUEd6UHFtWS9ubTdpOEdwYndzU3l3PT0%3D--2f3ae9c74cba34f2e8de6dfe55c3616e8a35ab20; path=/; secure; HttpOnly',
'Content-Encoding': 'gzip',
'X-GitHub-Request-Id': 'A947:3711:E0377A:13B4CEA:5D34143E'
}