HTTP headers - Requests - Python

Question

I am trying to scrape a website whose request headers include some attributes that are new to me, such as :authority, :method, :path and :scheme.

{
    ':authority': 'xxxx',
    ':method': 'GET',
    ':path': '/xxxx',
    ':scheme': 'https',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'cache-control': 'max-age=0',
    'cookie': 'GOOGLE_ABUSE_EXEMPTION=ID=0d5af55f1ada3f1e:TM=1533116294:C=r:IP=182.71.238.62-:S=APGng0u2o9IqL5wljH2o67S5Hp3hNcYIpw;1P_JAR=2018-8-1-9',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
    'x-client-data': 'CJG2yQEIpbbJAQjEtskBCKmdygEI2J3KAQioo8oBCIKkygE='
}

I tried passing them as headers with the HTTP request, but ended up with the error shown below.

ValueError: Invalid header name b':scheme'

Any help in understanding these attributes, and guidance on how to pass them in a request, would be appreciated.

Code added:

import requests

url = 'https://www.google.co.in/search?q=some+text'

headers = {
    ':authority': 'xxxx',
    ':method': 'GET',
    ':path': '/xxxx',
    ':scheme': 'https',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
    'x-client-data': 'CJG2yQEIpbbJAQjEtskBCKmdygEI2J3KAQioo8oBCIKkygE='
}

response = requests.get(url, headers=headers)

print(response.text)

Recommended answer

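Your error comes from Python's own source code: the standard library's http.client module checks every header name before sending the request and raises ValueError for names it considers illegal.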
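To see that check in isolation, here is a minimal sketch that triggers it without sending anything over the network. The helper name header_name_error and the host example.invalid are made up for illustration; putrequest and putheader are the real http.client methods, and neither opens a connection at this point.

```python
import http.client


def header_name_error(name, value="x"):
    """Return the ValueError message http.client raises for `name`, or None if it is accepted."""
    # Creating the connection object does not open a socket.
    conn = http.client.HTTPConnection("example.invalid")
    # putrequest only buffers the request line (and default headers) in memory.
    conn.putrequest("GET", "/")
    try:
        conn.putheader(name, value)
    except ValueError as exc:
        return str(exc)
    return None


print(header_name_error(":scheme"))  # rejected: the name starts with a colon
print(header_name_error("accept"))   # accepted: returns None
```

This appears to be the same validation that requests reaches through urllib3, which would explain why the traceback shows the header name as bytes.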

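HTTP header names cannot start with a colon, as the RFC states. The names :authority, :method, :path and :scheme are HTTP/2 pseudo-header fields that the browser's developer tools display alongside the real headers. They carry the same information as the request method and the URL, so you do not pass them as headers: remove them from the dictionary and let the library build them from the URL you request.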
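As a rough sketch of that advice, the snippet below drops every name that starts with a colon and rebuilds the URL from the pseudo-header values. The function names strip_pseudo_headers and url_from_pseudo_headers, and the example.com values, are made up for illustration; they are not part of requests.

```python
def strip_pseudo_headers(headers):
    """Return a copy of `headers` without names that start with ':'."""
    return {name: value for name, value in headers.items()
            if not name.startswith(":")}


def url_from_pseudo_headers(headers):
    """Rebuild the request URL from the :scheme, :authority and :path values."""
    return f"{headers[':scheme']}://{headers[':authority']}{headers[':path']}"


# Headers as copied from the browser's developer tools (placeholder values).
copied = {
    ":authority": "www.example.com",
    ":method": "GET",
    ":path": "/search?q=some+text",
    ":scheme": "https",
    "accept": "text/html",
    "user-agent": "Mozilla/5.0",
}

clean = strip_pseudo_headers(copied)
url = url_from_pseudo_headers(copied)

# With the requests library installed, the request could then be sent as:
#   import requests
#   response = requests.get(url, headers=clean)
```

The :method value is not lost either, since it corresponds to the function you call (requests.get for GET).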
