python requests POST with header and parameters
Problem description
I have a POST request which I am trying to send using requests in Python, but I get a 403 Forbidden error. The request works fine through the browser.
POST /ajax-load-system HTTP/1.1
Host: xyz.website.com
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-GB,en;q=0.5
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0
Referer: http://xyz.website.com/help-me/ZYc5Yn
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 56
Cookie: csrf_cookie_name=a3f8adecbf11e29c006d9817be96e8d4; ci_session=ba92hlh6o0ns7f20t4bsgjt0uqfdmdtl; _ga=GA1.2.1535910352.1530452604; _gid=GA1.2.1416631165.1530452604; _gat_gtag_UA_21820217_30=1
Connection: close
csrf_test_name=a3f8adecbf11e29c006d9817be96e8d4&vID=9999
What I am trying in Python is:
import requests
import json

url = 'http://xyz.website.com/ajax-load-system'

payload = {
    'Host': 'xyz.website.com',
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Accept-Language': 'en-GB,en;q=0.5',
    'Referer': 'http://xyz.website.com/help-me/ZYc5Yn',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'X-Requested-With': 'XMLHttpRequest',
    'Content-Length': '56',
    'Cookie': 'csrf_cookie_name=a3f8adecbf11e29c006d9817be96e8d4; ci_session=ba92hlh6o0ns7f20t4bsgjt0uqfdmdtl; _ga=GA1.2.1535910352.1530452604; _gid=GA1.2.1416631165.1530452604; _gat_gtag_UA_21820217_30=1',
    'Connection': 'close',
    'csrf_test_name': 'a3f8adecbf11e29c006d9817be96e8d4',
    'vID': '9999',
}
headers = {}

r = requests.post(url, headers=headers, data=json.dumps(payload))
print(r.status_code)
But this is printing a 403 error code. What am I doing wrong here?
I am expecting a return response as json:
{"status_message":"Thanks for help.","help_count":"141","status":true}
Answer
You are confusing headers and payload, and the payload is not JSON encoded.
These are all headers:
Host: xyz.website.com
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-GB,en;q=0.5
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0
Referer: http://xyz.website.com/help-me/ZYc5Yn
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 56
Cookie: csrf_cookie_name=a3f8adecbf11e29c006d9817be96e8d4; ci_session=ba92hlh6o0ns7f20t4bsgjt0uqfdmdtl; _ga=GA1.2.1535910352.1530452604; _gid=GA1.2.1416631165.1530452604; _gat_gtag_UA_21820217_30=1
Connection: close
Most of these are automated and don't need to be set manually:

- Host: requests sets this for you based on the URL.
- Accept: set to an acceptable default.
- Accept-Language: rarely needed in these situations.
- Referer: unless using HTTPS, this is often not even set or is filtered out for privacy reasons, so sites no longer rely on it being set.
- Content-Type: must actually reflect the contents of your POST (and it is not JSON!), so requests sets this for you depending on how you call it.
- Content-Length: must reflect the actual content length, so it is set by requests, which is in the best position to calculate it.
- Connection: should definitely be handled by the library, as you don't want to prevent it from efficiently re-using connections if it can.
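You can see this in action without touching the network by building a PreparedRequest; a small sketch, assuming the requests library is installed, using the vID field from the question:

```python
import requests

# Prepare (but do not send) a form POST; requests derives the body,
# Content-Type, and Content-Length from the data mapping you pass in.
prepared = requests.Request(
    'POST', 'http://xyz.website.com/ajax-load-system',
    data={'vID': '9999'},
).prepare()

print(prepared.body)                       # vID=9999
print(prepared.headers['Content-Type'])    # application/x-www-form-urlencoded
print(prepared.headers['Content-Length'])  # 8
```

(Host and Connection are handled even later, by the connection layer, which is why you never need to set them yourself.)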
At best you could set X-Requested-With and User-Agent, but only if the server would not otherwise accept the request. The Cookie header reflects the values of the cookies the browser holds. Your script can obtain its own set of cookies from the server by using a requests Session object to make an initial GET request to the URL named in the Referer header (or another suitable URL on the same site), at which point the server should set cookies on the response; those are stored in the session and reused on the POST request. Use that mechanism to get your own CSRF cookie value.
Note the Content-Type header:
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
When you pass a dictionary to the data keyword of the requests.post() function, the library will encode the data to exactly that content type for you.
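There is nothing magical about that encoding; the standard library produces the same string that requests sends on your behalf (shown here with urllib purely for illustration):

```python
from urllib.parse import urlencode

# The same application/x-www-form-urlencoded encoding that requests
# applies when you pass a dict to the data= keyword.
payload = {
    'csrf_test_name': 'a3f8adecbf11e29c006d9817be96e8d4',
    'vID': 9999,
}
print(urlencode(payload))
# csrf_test_name=a3f8adecbf11e29c006d9817be96e8d4&vID=9999
```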
The actual payload is:
csrf_test_name=a3f8adecbf11e29c006d9817be96e8d4&vID=9999
These are two fields, csrf_test_name and vID, that need to be part of your payload dictionary.
Note that the csrf_test_name value matches the csrf_cookie_name value in the cookies. This is how the site protects itself from cross-site request forgery attacks, where a third party may try to post to the same URL on your behalf. Such a third party would not have access to the same cookies, so they would be blocked. Your code needs to obtain a new cookie; a proper CSRF implementation limits the time any CSRF cookie can be re-used.
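The ci_session cookie suggests the site runs CodeIgniter, whose default CSRF names are exactly csrf_test_name and csrf_cookie_name. Conceptually, the server-side check can be sketched like this (a hypothetical simplification, not the site's actual code):

```python
# Hypothetical sketch of a "double submit cookie" CSRF check:
# the POSTed form token must match the token in the cookie, else 403.
def csrf_check(cookies, form):
    token = cookies.get('csrf_cookie_name')
    return token is not None and form.get('csrf_test_name') == token

token = 'a3f8adecbf11e29c006d9817be96e8d4'
print(csrf_check({'csrf_cookie_name': token}, {'csrf_test_name': token}))  # True
print(csrf_check({}, {'csrf_test_name': token}))                           # False
```

A third party cannot read your cookie, so it cannot produce a matching form token, which is why replaying the captured value from another machine (or after it expires) yields a 403.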
So what would at least be needed to make it all work, is:
import requests

# *optional*, the site may not care about these. If they *do* care, then
# they care about keeping out automated scripts and could in future
# raise the stakes and require more 'browser-like' markers. Ask yourself
# if you want to anger the site owners and get into an arms race.
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0',
    'X-Requested-With': 'XMLHttpRequest',
}
payload = {
    'vID': 9999,
}
url = 'http://xyz.website.com/ajax-load-system'
# the URL from the Referer header, but others at the site would probably
# also work
initial_url = 'http://xyz.website.com/help-me/ZYc5Yn'

with requests.Session() as session:
    # obtain the CSRF cookie with an initial GET request
    initial_response = session.get(initial_url)
    payload['csrf_test_name'] = session.cookies['csrf_cookie_name']
    # now actually POST with the correct CSRF token in the payload
    response = session.post(url, headers=headers, data=payload)
If this still causes issues, you'll need to try the two additional headers, Accept and Accept-Language. Take into account that this would mean the site has already thought long and hard about how to keep automated scrapers out. Consider contacting them and asking if they offer an API option instead.