如何使用urllib发送cookie [英] How to send cookies with urllib
问题描述
我正试图连接到一个要求您具有特定cookie才能访问它的网站。为了这个问题,我们将cookie称为 required_cookie,并将值称为 required_value。
I'm attempting to connect to a website that requires you to have a specific cookie to access it. For the sake of this question, we'll call the cookie 'required_cookie' and the value 'required_value'.
这是我的代码:
import urllib
import http.cookiejar
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
opener.addheaders = [('required_cookie', 'required_value'), ('User-Agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)
req = Request('https://www.thewebsite.com/')
webpage = urlopen(req).read()
print(webpage)
我是urllib的新手因此,请以初学者的身份回答我
I'm new to urllib so please answer me as a beginner
推荐答案
要使用 urllib
做到这一点,则需要:
To do this with urllib
, you need to:
- 构造一个
Cookie
对象。构造函数未在文档中记录,但是如果您在交互式解释器中help(http.cookiejar.Cookie)
,则可以看到其构造函数要求所有16个值属性。请注意,文档说,不应期望http.cookiejar的用户构造自己的Cookie实例。 - 使用
cj.set_cookie(cookie)
。 - 使用
cj.add_cookie_headers(req)
。
- Construct a
Cookie
object. The constructor isn't documented in the docs, but if youhelp(http.cookiejar.Cookie)
in the interactive interpreter, you can see that its constructor demands values for all 16 attributes. Notice that the docs say, "It is not expected that users of http.cookiejar construct their own Cookie instances." - Add it to the cookiejar with
cj.set_cookie(cookie)
. - Tell the cookiejar to add the correct headers to the request with
cj.add_cookie_headers(req)
.
假设您已经正确配置了策略,就可以设置。
Assuming you've configured the policy correctly, you're set.
但这是一个巨大的痛苦。作为 urllib.request $ c $的文档c>
说:
But this is a huge pain. As the docs for urllib.request
say:
另请参见 请求包建议用于更高级别的HTTP客户端界面。
See also The Requests package is recommended for a higher-level HTTP client interface.
而且,除非您有充分的理由不能安装请求
,否则您应该这样做。 urllib
在非常简单的情况下是可以容忍的,当您需要深入了解时可以很方便-但对于其他所有内容,请求
更好。
And, unless you have some good reason you can't install requests
, you really should go that way. urllib
is tolerable for really simple cases, and it can be handy when you need to get deep under the covers—but for everything else, requests
is much better.
有了个请求
,您的整个程序就变成了单行代码:
With requests
, your whole program becomes a one-liner:
webpage = requests.get('https://www.thewebsite.com/', cookies={'required_cookie': required_value}, headers={'User-Agent': 'Mozilla/5.0'}).text
…尽管它可能更容易理解为几行:
… although it's probably more readable as a few lines:
cookies = {'required_cookie': required_value}
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://www.thewebsite.com/', cookies=cookies, headers=headers)
webpage = response.text
这篇关于如何使用urllib发送cookie的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!