No response for XHR request in Python with requests.get()
Question
I want to scrape German poll data from a server. Here, I search for an example street (Straße), "Judengasse".
I have been trying to reproduce this. Unfortunately, the link from the reference is not intact anymore, so I couldn't directly compare it to my problem. Since I am fairly inexperienced, I do not know what is exactly needed to reproduce the request that is submitted via the web interface.
I don't know which header attributes are needed for my request to work and which might be redundant. In Chrome's inspect mode I see that in my case there are more header attributes than in the referenced example.
My code so far (which does not work), from trying to reproduce the SE post:
import requests
url = 'https://online-service2.nuernberg.de/Finder/action/getItems'
data = {
"finder":"Wahlraumfinder",
"strasse":"Judengasse",
"hausnummer":"0"
}
headers = {
'Host': 'online-service2.nuernberg.de',
'Referer': 'https://online-service2.nuernberg.de/Finder/?Wahlraumfinder',
'Accept': '*/*',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7',
'Connection': 'keep-alive',
'Content-Length': '312',
'Content-Type': 'multipart/form-data; boundary=----WebKitFormBoundaryeJZfrnZATOw6B5By',
'DNT': '1',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Site': 'same-origin',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36',
'X-Requested-With': 'XMLHttpRequest'
}
response = requests.get(url, data=data, headers=headers)
I don't get a response. I added all request headers to headers.
I am not sure whether more headers are needed.
Further, I am not sure whether the url is correct.
I am looking to generate output of the following form for this specific request, "Judengasse":
Nr 0652
Wahllokal Willstätt.-Gym., Innerer Laufer Platz 11
This corresponds to entering "Judengasse" into the search bar, clicking the search button "Suche", and extracting parts of the first output box, "Wahl-/Stimmbezirk".
When I look at the XHR in Chrome's dev mode:
General
Request URL: https://online-service2.nuernberg.de/Finder/action/getItems
Request Method: POST
Status Code: 200 OK
Remote Address: 193.22.166.102:443
Referrer Policy: no-referrer-when-downgrade
Response Headers
Connection: Keep-Alive
Content-Length: 1149
Content-Type: application/json;charset=UTF-8
Date: Wed, 04 Dec 2019 00:21:30 GMT
Keep-Alive: timeout=5, max=100
Server: Apache
Request Headers
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
Connection: keep-alive
Content-Length: 312
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryx2jHYJHo3ejnKw0l
DNT: 1
Host: online-service2.nuernberg.de
Origin: https://online-service2.nuernberg.de
Referer: https://online-service2.nuernberg.de/Finder/?Wahlraumfinder
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36
X-Requested-With: XMLHttpRequest
Form Data
------WebKitFormBoundaryx2jHYJHo3ejnKw0l
Content-Disposition: form-data; name="action"
"action/getItems"
------WebKitFormBoundaryx2jHYJHo3ejnKw0l
Content-Disposition: form-data; name="data"
{"finder":"Wahlraumfinder","strasse":"Judengasse","hausnummer":"0"}
------WebKitFormBoundaryx2jHYJHo3ejnKw0l--
Thanks for reading.
Recommended answer
After some research, I finally managed to get a 200 response from this server.
First, requests.get should be replaced by requests.post in this case, since you want to replicate an HTTP POST request, as shown in the "General" section of Chrome's dev mode.
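To see what the original call would have put on the wire, the request can be prepared without being sent. The following is a minimal sketch, assuming the requests package is installed; the names as_get and as_post are only illustrative, and nothing here contacts the server.

```python
import requests

url = "https://online-service2.nuernberg.de/Finder/action/getItems"
data = {"finder": "Wahlraumfinder", "strasse": "Judengasse", "hausnummer": "0"}

# Build the requests offline; prepare() does not open a connection.
as_get = requests.Request("GET", url, data=data).prepare()
as_post = requests.Request("POST", url, data=data).prepare()

# The method differs, but a plain dict passed as data= is URL-encoded either way.
print(as_get.method, as_get.headers["Content-Type"])
print(as_get.body)
print(as_post.method, as_post.headers["Content-Type"])
```

In both cases the body is a URL-encoded form rather than the multipart body the browser sent, so switching the method is probably only part of the fix.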
Second, the headers show that the data is sent as a "multipart/form-data" request. As far as I understand, this type of request is typically used to send files rather than regular form data (more about this type of request here).
So, I converted the string sent through the POST request to bytes (this is achieved by prepending b) and passed it to the files parameter of the request. In my case, this parameter worked with a tuple (a, b) inside a set {c}, hence the {(None, data)}.
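As an alternative to writing the multipart body by hand, requests can generate the body and the boundary itself when each form field is passed to files as a (None, value) tuple, which produces a part without a filename. The sketch below assumes the server only needs the two parts, "action" and "data", seen in the browser's form data; build_request is a hypothetical helper name, and the request is only prepared, not sent.

```python
import json

import requests

URL = "https://online-service2.nuernberg.de/Finder/action/getItems"


def build_request(street, house_number="0"):
    """Prepare (but do not send) the multipart POST for one street."""
    payload = {"finder": "Wahlraumfinder", "strasse": street, "hausnummer": house_number}
    files = {
        # (None, value) means: no filename, so this is a plain form field.
        "action": (None, json.dumps("action/getItems")),
        "data": (None, json.dumps(payload, separators=(",", ":"))),
    }
    # requests picks the boundary and sets Content-Type and Content-Length.
    return requests.Request("POST", URL, files=files).prepare()


prepared = build_request("Judengasse")
print(prepared.headers["Content-Type"])
print(prepared.body.decode("utf-8"))
```

The prepared request could then be sent with requests.Session().send(prepared). Whether the server also insists on headers such as X-Requested-With or Referer has not been checked here.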
I also passed the street name as a parameter to data, so it's easier to manipulate.
I got this working code (I'm using my browser's request):
import requests
url = 'https://online-service2.nuernberg.de/Finder/action/getItems'
street = b'Judengasse'
data = b'-----------------------------15242581323522\r\n' \
b'Content-Disposition: form-data; name=\"action\"\r\n\r\n' \
b'\"action/getItems\"\r\n-----------------------------15242581323522\r\n' \
b'Content-Disposition: form-data; name="data"\r\n\r\n' \
b'{\"finder\":\"Wahlraumfinder\",\"strasse\":\"%s\",\"hausnummer\":\"0\"}\r\n' \
b'-----------------------------15242581323522--' % street
headers = {"Host": "online-service2.nuernberg.de",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0",
"Accept": "*/*",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate, br",
"X-Requested-With": "XMLHttpRequest",
"Content-Type": "multipart/form-data; boundary=---------------------------15242581323522",
"Content-Length": "321",
"Origin": "https://online-service2.nuernberg.de",
"DNT": "1",
"Connection": "keep-alive",
"Referer": "https://online-service2.nuernberg.de/Finder/?Wahlraumfinder",
}
multipart_data = {(None, data,)}
response = requests.post(url, files=multipart_data, headers=headers)
print(response.text)
I got this raw response:
{"id":"8c4f7a57-1bd6-423a-8ab8-e1e40e1e3852","items":[{"zeilenbeschriftung":"Wahl-/Stimmbezirk","linkAdr":null,"mapUrl":"http://online-service.nuernberg.de/Themenstadtplan/sta_gebietsgliederungen.aspx?p_urlvislayer=Stimmbezirke&XKoord=4433503.05&YKoord=5480253.301&Zaehler=1&Textzusatz=Judengasse+0&z_XKoord=4433670.0&z_YKoord=5480347.0&z_Zaehler=1&z_Textzusatz=Wahllokal%20Willst%E4tt.-Gym.%2C+Innerer+Laufer+Platz+11","items":["0652","Judengasse, Neue Gasse","Willstätt.-Gym., Innerer Laufer Platz 11","Zi. 101 ,1. OG",null]},{"zeilenbeschriftung":"Stimmkreis Landtagswahl","linkAdr":null,"mapUrl":"http://online-service.nuernberg.de/Themenstadtplan/sta_gebietsgliederungen.aspx?p_urlvislayer=Stimmkreis_LTW&XKoord=4433503.05&YKoord=5480253.301&Zaehler=1&Textzusatz=Judengasse+0&p_scale=100000","items":["501","Nürnberg-Nord"]},{"zeilenbeschriftung":"Wahlkreis Bundestagswahl","linkAdr":null,"mapUrl":"http://online-service.nuernberg.de/Themenstadtplan/sta_gebietsgliederungen.aspx?p_urlvislayer=Wahlkreis_BTW&XKoord=4433503.05&YKoord=5480253.301&Zaehler=1&Textzusatz=Judengasse+0&p_scale=150000","items":["244","Nürnberg-Nord"]}],"status":200}
which you can easily parse to get the result you expect:
print(response.json()["items"][0]["items"])
yielding:
['0652', 'Judengasse, Neue Gasse', 'Willstätt.-Gym., Innerer Laufer Platz 11', 'Zi. 101 ,1. OG', None]
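To get from that list to the "Nr" / "Wahllokal" form requested in the question, the entry labelled "Wahl-/Stimmbezirk" can be looked up by name instead of by position. This is a minimal sketch using a shortened copy of the response above; extract_polling_place is a hypothetical helper, and the assumption that index 0 holds the district number and index 2 the polling place is inferred from this single sample response.

```python
import json

# Shortened copy of the raw response shown above (only the fields used here).
raw = (
    '{"items":[{"zeilenbeschriftung":"Wahl-/Stimmbezirk",'
    '"items":["0652","Judengasse, Neue Gasse",'
    '"Willstätt.-Gym., Innerer Laufer Platz 11","Zi. 101 ,1. OG",null]}],'
    '"status":200}'
)


def extract_polling_place(payload):
    """Return the district number and polling place, or None if absent."""
    for row in payload.get("items", []):
        if row.get("zeilenbeschriftung") == "Wahl-/Stimmbezirk":
            fields = row["items"]
            # Assumed layout: [number, streets, polling place, room, ...]
            return {"Nr": fields[0], "Wahllokal": fields[2]}
    return None


result = extract_polling_place(json.loads(raw))
for label, value in result.items():
    print(label, value)
```

With a live response, json.loads(raw) would be replaced by response.json().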
Hope that helps.