Python requests returns HTTP response 500 (site can be reached in browser)
I am trying to figure out what I'm doing wrong here, but I keep getting lost.
In Python 2.7, I'm running the following code:
>>> import requests
>>> req = requests.request('GET', 'https://www.zomato.com/praha/caf%C3%A9-a-restaurant-z%C3%A1ti%C5%A1%C3%AD-kunratice-praha-4/daily-menu')
>>> req.content
'<html><body><h1>500 Server Error</h1>\nAn internal server error occured.\n</body></html>\n'
If I open this URL in a browser, it responds properly. I dug around and found a similar question about the urllib library (500 error with urllib.request.urlopen), but I was not able to adapt it, and I would prefer to use requests here anyway.
I might be missing some proxy setting, as suggested for example here (Perl File::Fetch Failed HTTP response: 500 Internal Server Error), but can someone explain what the proper workaround is?
One thing that differs from the browser request is the User-Agent header; you can change it with requests like this:
import requests

url = 'https://www.zomato.com/praha/caf%C3%A9-a-restaurant-z%C3%A1ti%C5%A1%C3%AD-kunratice-praha-4/daily-menu'
# Send a browser-like User-Agent instead of the default one used by requests
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36'}
response = requests.get(url, headers=headers)
print(response.status_code)  # should be 200 if the User-Agent was the problem
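To see the difference for yourself, the sketch below prints the User-Agent that requests sends by default and then sets a browser-like one on a Session, so that every request made through that session carries it. It only prepares a request and does not send anything; the `http://example.com/` URL is a placeholder, and the names `BROWSER_UA`, `default_ua` and `prepared` are just illustrative.

```python
import requests

# Example browser User-Agent string, copied from the snippet above.
BROWSER_UA = (
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36'
)

# What requests sends when no User-Agent is given (starts with 'python-requests/').
default_ua = requests.utils.default_user_agent()

# A Session applies its headers to every request made through it.
session = requests.Session()
session.headers.update({'User-Agent': BROWSER_UA})

# Build the request without sending it, to inspect the final headers.
# 'http://example.com/' is a placeholder URL, not the site from the question.
prepared = session.prepare_request(requests.Request('GET', 'http://example.com/'))

print(default_ua)
print(prepared.headers['User-Agent'])
```

With the session set up this way, `session.get(url)` can be used in place of `requests.get(url, headers=headers)`.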
Edit
Some web applications will also check the Origin and/or the Referer headers (for example for AJAX requests); you can set these in a similar fashion to User-Agent:
headers = {
    'Origin': 'http://example.com',
    'Referer': 'http://example.com/some_page'
}
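If a site needs several of these headers at once, they can go into a single dict. The following is a minimal sketch: `build_headers` is a hypothetical helper (not part of requests), and the example.com URLs are placeholders. The request is only prepared, not sent, so the headers can be checked before anything goes over the network.

```python
import requests


def build_headers(user_agent, origin=None, referer=None):
    """Hypothetical helper: collect browser-like headers into one dict."""
    headers = {'User-Agent': user_agent}
    if origin:
        headers['Origin'] = origin
    if referer:
        headers['Referer'] = referer
    return headers


headers = build_headers(
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36',
    origin='http://example.com',
    referer='http://example.com/some_page',
)

# Prepare (but do not send) a request to inspect the headers it would carry.
prepared = requests.Request(
    'GET', 'http://example.com/some_page', headers=headers
).prepare()

for name in ('User-Agent', 'Origin', 'Referer'):
    print(name, '=', prepared.headers[name])
```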
Remember that you are setting these headers to bypass checks, so please be a good netizen and don't abuse people's resources.