How to send cookies with urllib

Question

I'm attempting to connect to a website that requires a specific cookie before it grants access. For the sake of this question, we'll call the cookie 'required_cookie' and its value 'required_value'.

This is my code:

import urllib.request
import http.cookiejar

cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

opener.addheaders = [('required_cookie', 'required_value'), ('User-Agent', 'Mozilla/5.0')]

urllib.request.install_opener(opener)

req = urllib.request.Request('https://www.thewebsite.com/')
webpage = urllib.request.urlopen(req).read()
print(webpage)

I'm new to urllib, so please answer as you would for a beginner.

Recommended answer

To do this with urllib, you need to:


  • Construct a Cookie object. The constructor isn't covered in the documentation, but if you run help(http.cookiejar.Cookie) in the interactive interpreter, you can see that it demands values for all 16 attributes. Note that the docs say, "It is not expected that users of http.cookiejar construct their own Cookie instances."
  • Add it to the cookiejar with cj.set_cookie(cookie).
  • Tell the cookiejar to add the correct headers to the request with cj.add_cookie_header(req).

Assuming you've configured the policy correctly, you're set.
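The three steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the cookie name, value, and URL are the placeholders from the question, the domain and path are assumptions derived from that URL, and the remaining attribute values are one plausible choice for a plain session cookie under the default policy. No network request is made; the sketch only inspects the header the cookiejar attaches.

```python
import http.cookiejar
import urllib.request

# Placeholder URL from the question.
URL = 'https://www.thewebsite.com/'

# Step 1: construct the Cookie by hand. All 16 attributes must be supplied.
# The domain and path are assumptions that match the placeholder URL.
cookie = http.cookiejar.Cookie(
    version=0,                    # Netscape-style cookie
    name='required_cookie',
    value='required_value',
    port=None,
    port_specified=False,
    domain='www.thewebsite.com',
    domain_specified=False,
    domain_initial_dot=False,
    path='/',
    path_specified=True,
    secure=False,
    expires=None,                 # no expiry: a session cookie
    discard=True,
    comment=None,
    comment_url=None,
    rest={},
)

# Step 2: add the cookie to the jar.
cj = http.cookiejar.CookieJar()
cj.set_cookie(cookie)

# Step 3: let the jar attach the matching Cookie header to the request.
req = urllib.request.Request(URL, headers={'User-Agent': 'Mozilla/5.0'})
cj.add_cookie_header(req)

cookie_header = req.get_header('Cookie')
print(cookie_header)  # required_cookie=required_value

# To actually send the request:
# webpage = urllib.request.urlopen(req).read()
```

If the request goes through an opener built with HTTPCookieProcessor(cj), as in the question, the processor calls add_cookie_header on each request itself, so only the first two steps are needed.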

But this is a huge pain. As the docs for urllib.request say:


"See also: The Requests package is recommended for a higher-level HTTP client interface."

And unless you have a good reason you can't install requests, you really should go that way. urllib is tolerable for really simple cases, and it can be handy when you need to get deep under the covers, but for everything else, requests is much better.
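For a case as simple as the one in the question, one way to stay with urllib and skip the cookiejar entirely is to send the cookie as a literal Cookie header. This is a sketch under the assumption that the site only needs that single fixed cookie; it does not store or return any cookies the server sets in its response. The name, value, and URL are the question's placeholders.

```python
import urllib.request

# A cookie travels in the "Cookie" request header as name=value pairs,
# so it can be set like any other header.
req = urllib.request.Request(
    'https://www.thewebsite.com/',
    headers={
        'Cookie': 'required_cookie=required_value',
        'User-Agent': 'Mozilla/5.0',
    },
)

# To actually send the request:
# webpage = urllib.request.urlopen(req).read()
```

This differs from the question's code in one respect: the header name is Cookie and the cookie's name and value go together in the header value, rather than the cookie name being used as the header name.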

With requests, your whole program becomes a one-liner:

webpage = requests.get('https://www.thewebsite.com/', cookies={'required_cookie': 'required_value'}, headers={'User-Agent': 'Mozilla/5.0'}).text

… although it's probably more readable as a few lines:

import requests

cookies = {'required_cookie': 'required_value'}
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://www.thewebsite.com/', cookies=cookies, headers=headers)
webpage = response.text
