No response for XHR request in Python with requests.get()


Question

I want to scrape German poll data from a server. Here, I search for an example street (Straße), "Judengasse".

I have been trying to reproduce this. Unfortunately, the link in the reference is no longer intact, so I couldn't compare it directly to my problem. Since I am fairly inexperienced, I do not know exactly what is needed to reproduce the request that is submitted via the web interface.

I don't know which header attributes are needed for my request to work and which might be redundant. In Chrome's inspect mode I see that in my case there are more header attributes than in the referenced example.

My code so far (which does not work), from trying to reproduce the SE post:

import requests

url = 'https://online-service2.nuernberg.de/Finder/action/getItems'
data = {
    "finder":"Wahlraumfinder",
    "strasse":"Judengasse",
    "hausnummer":"0"
    }

headers = {
           'Host': 'online-service2.nuernberg.de', 
           'Referer': 'https://online-service2.nuernberg.de/Finder/?Wahlraumfinder', 
           'Accept': '*/*', 
           'Accept-Encoding': 'gzip, deflate, br', 
           'Accept-Language': 'de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7', 
           'Connection': 'keep-alive', 
           'Content-Length': '312', 
           'Content-Type': 'multipart/form-data; boundary=----WebKitFormBoundaryeJZfrnZATOw6B5By', 
           'DNT': '1', 
           'Host': 'online-service2.nuernberg.de', 
           'Referer': 'https://online-service2.nuernberg.de/Finder/?Wahlraumfinder', 
           'Sec-Fetch-Mode': 'cors', 
           'Sec-Fetch-Site': 'same-origin', 
           'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36',
           'X-Requested-With': 'XMLHttpRequest'
           }

response = requests.get(url, data=data, headers=headers)

I don't get a response. I added all request headers to headers.

I am not sure whether more headers are needed.

Further, I am not sure whether the URL is correct.

I am looking to generate output of the following form for this specific request, "Judengasse":

Nr 0652
Wahllokal Willstätt.-Gym., Innerer Laufer Platz 11

This corresponds to entering "Judengasse" into the search bar, clicking the search button ("Suche"), and extracting parts of the first output box, "Wahl-/Stimmbezirk".

This is what I see when I look at the XHR in Chrome's dev mode:

General

Request URL: https://online-service2.nuernberg.de/Finder/action/getItems
Request Method: POST
Status Code: 200 OK
Remote Address: 193.22.166.102:443
Referrer Policy: no-referrer-when-downgrade

Response Headers

Connection: Keep-Alive
Content-Length: 1149
Content-Type: application/json;charset=UTF-8
Date: Wed, 04 Dec 2019 00:21:30 GMT
Keep-Alive: timeout=5, max=100
Server: Apache

Request Headers

Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
Connection: keep-alive
Content-Length: 312
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryx2jHYJHo3ejnKw0l
DNT: 1
Host: online-service2.nuernberg.de
Origin: https://online-service2.nuernberg.de
Referer: https://online-service2.nuernberg.de/Finder/?Wahlraumfinder
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36
X-Requested-With: XMLHttpRequest

Form Data

------WebKitFormBoundaryx2jHYJHo3ejnKw0l
Content-Disposition: form-data; name="action"

"action/getItems"
------WebKitFormBoundaryx2jHYJHo3ejnKw0l
Content-Disposition: form-data; name="data"

{"finder":"Wahlraumfinder","strasse":"Judengasse","hausnummer":"0"}
------WebKitFormBoundaryx2jHYJHo3ejnKw0l--

Thanks for reading.

Recommended answer

After some research I finally managed to get a 200 response from this server.

Firstly, requests.get should be replaced by requests.post in this case, since you want to replicate an HTTP POST request, as shown in the "General" section of Chrome's dev mode.

Secondly, the headers show that the data is sent with the content type "multipart/form-data". As far as I understand, this type of request is typically used for uploading files, although it can also carry regular form fields, as it does here (more about this type of request here).

So, I converted the string sent through the POST request to bytes (by prefixing the literal with b) and passed it to the files parameter of the request. This parameter is normally given as a dict or as a list of (name, value) tuples. Here I pass a set containing the single tuple (None, data), hence the {(None, data)}.

I also interpolate the street name into data as a parameter, so it's easier to change.
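To make the structure of a multipart/form-data body more concrete, here is a minimal sketch that rebuilds the "Form Data" shown in the question using only the standard library. The helper name build_multipart_body is hypothetical, and the boundary is simply the one Chrome happened to generate in the question. The point it illustrates is that each delimiter line is "--" followed by the boundary from the Content-Type header, and that the final delimiter carries an extra trailing "--".

```python
import json


def build_multipart_body(fields, boundary):
    """Encode plain text fields as a multipart/form-data body (hypothetical helper)."""
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)  # part delimiter: "--" + boundary
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")               # blank line separates part headers from the value
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing delimiter
    return ("\r\n".join(lines) + "\r\n").encode("utf-8")


# Boundary copied from the Content-Type request header in the question.
boundary = "----WebKitFormBoundaryx2jHYJHo3ejnKw0l"
payload = {"finder": "Wahlraumfinder", "strasse": "Judengasse", "hausnummer": "0"}
fields = {
    "action": '"action/getItems"',
    # Compact separators reproduce the JSON exactly as the browser sent it.
    "data": json.dumps(payload, separators=(",", ":")),
}

body = build_multipart_body(fields, boundary)
print(body.decode("utf-8"))
print(len(body))
```

The printed length can be compared with the Content-Length request header that Chrome reported for the same street and boundary.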

I got this working code (based on the request my own browser sends):

import requests

url = 'https://online-service2.nuernberg.de/Finder/action/getItems'

street = b'Judengasse'

data = b'-----------------------------15242581323522\r\n' \
       b'Content-Disposition: form-data; name=\"action\"\r\n\r\n' \
       b'\"action/getItems\"\r\n-----------------------------15242581323522\r\n' \
       b'Content-Disposition: form-data; name="data"\r\n\r\n' \
       b'{\"finder\":\"Wahlraumfinder\",\"strasse\":\"%s\",\"hausnummer\":\"0\"}\r\n' \
       b'-----------------------------15242581323522--' % street

headers = {"Host": "online-service2.nuernberg.de",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0",
            "Accept": "*/*",
            "Accept-Language": "en-US,en;q=0.5",
            "Accept-Encoding": "gzip, deflate, br",
            "X-Requested-With": "XMLHttpRequest",
            "Content-Type": "multipart/form-data; boundary=---------------------------15242581323522",
            "Content-Length": "321",
            "Origin": "https://online-service2.nuernberg.de",
            "DNT": "1",
            "Connection": "keep-alive",
            "Referer": "https://online-service2.nuernberg.de/Finder/?Wahlraumfinder",
           }


multipart_data = {(None, data,)}
response = requests.post(url, files=multipart_data, headers=headers)

print(response.text)
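As a possible alternative to hand-writing the body, requests can build the multipart encoding itself when each field is passed to files as a (None, value) tuple, which sends the field without a filename. The sketch below assumes the server accepts a library-generated boundary and needs only the two headers shown; neither assumption has been verified against this server. The names build_files and fetch_poll_items are hypothetical.

```python
import json

URL = "https://online-service2.nuernberg.de/Finder/action/getItems"


def build_files(street, house_number="0"):
    """Return the two form fields in the shape requests expects for `files`."""
    payload = {"finder": "Wahlraumfinder", "strasse": street, "hausnummer": house_number}
    return {
        # (None, value) means "no filename", so each part is a plain form field.
        "action": (None, '"action/getItems"'),
        "data": (None, json.dumps(payload, separators=(",", ":"))),
    }


def fetch_poll_items(street):
    """POST the form and return the decoded JSON (assumption: these headers suffice)."""
    import requests  # third-party; imported here so build_files works without it

    headers = {
        "X-Requested-With": "XMLHttpRequest",
        "Referer": "https://online-service2.nuernberg.de/Finder/?Wahlraumfinder",
    }
    # Content-Type and Content-Length are deliberately not set by hand:
    # requests generates the boundary and computes the length itself.
    response = requests.post(URL, files=build_files(street), headers=headers, timeout=10)
    response.raise_for_status()
    return response.json()


# Example call (performs a real network request):
# print(fetch_poll_items("Judengasse")["items"][0]["items"])
```

Leaving Content-Type unset matters in this variant, because a hand-written boundary in the header would not match the one requests generates for the body.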

I got this raw response:

{"id":"8c4f7a57-1bd6-423a-8ab8-e1e40e1e3852","items":[{"zeilenbeschriftung":"Wahl-/Stimmbezirk","linkAdr":null,"mapUrl":"http://online-service.nuernberg.de/Themenstadtplan/sta_gebietsgliederungen.aspx?p_urlvislayer=Stimmbezirke&XKoord=4433503.05&YKoord=5480253.301&Zaehler=1&Textzusatz=Judengasse+0&z_XKoord=4433670.0&z_YKoord=5480347.0&z_Zaehler=1&z_Textzusatz=Wahllokal%20Willst%E4tt.-Gym.%2C+Innerer+Laufer+Platz+11","items":["0652","Judengasse, Neue Gasse","Willstätt.-Gym., Innerer Laufer Platz 11","Zi. 101 ,1. OG",null]},{"zeilenbeschriftung":"Stimmkreis Landtagswahl","linkAdr":null,"mapUrl":"http://online-service.nuernberg.de/Themenstadtplan/sta_gebietsgliederungen.aspx?p_urlvislayer=Stimmkreis_LTW&XKoord=4433503.05&YKoord=5480253.301&Zaehler=1&Textzusatz=Judengasse+0&p_scale=100000","items":["501","Nürnberg-Nord"]},{"zeilenbeschriftung":"Wahlkreis Bundestagswahl","linkAdr":null,"mapUrl":"http://online-service.nuernberg.de/Themenstadtplan/sta_gebietsgliederungen.aspx?p_urlvislayer=Wahlkreis_BTW&XKoord=4433503.05&YKoord=5480253.301&Zaehler=1&Textzusatz=Judengasse+0&p_scale=150000","items":["244","Nürnberg-Nord"]}],"status":200}

which you can easily parse to get the result you expect:

print(response.json()["items"][0]["items"])

yielding:

['0652', 'Judengasse, Neue Gasse', 'Willstätt.-Gym., Innerer Laufer Platz 11', 'Zi. 101 ,1. OG', None]
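To get from that list to the two-line output requested in the question, something like the following sketch could be used. It runs on an abbreviated copy of the raw response above rather than on a live request, and it assumes that the first entry of the inner list is the district number and the third is the polling place, which is inferred from this single sample only. The function name format_polling_place is hypothetical.

```python
import json

# Abbreviated copy of the raw response shown above (first output box only).
raw = (
    '{"items":[{"zeilenbeschriftung":"Wahl-/Stimmbezirk",'
    '"items":["0652","Judengasse, Neue Gasse",'
    '"Willstätt.-Gym., Innerer Laufer Platz 11","Zi. 101 ,1. OG",null]}],'
    '"status":200}'
)


def format_polling_place(response_json):
    """Format the first output box as 'Nr ...' and 'Wahllokal ...' lines."""
    first_box = response_json["items"][0]
    # Assumed positions: 0 = district number, 1 = streets, 2 = polling place.
    number, _streets, location = first_box["items"][:3]
    return "Nr %s\nWahllokal %s" % (number, location)


print(format_polling_place(json.loads(raw)))
```

With a live request, the same function could be called on response.json() instead of json.loads(raw).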

Hope it helps.
