Python urllib2 returning an empty string

Views: 40 | Posted: 2021/9/15 18:38:54 | Tags: python, urllib2


Problem description

I'm trying to retrieve the following URL: http://www.winkworth.co.uk/sale/property/flat-for-sale-in-masefield-court-london-n5/HIH140004.

import urllib2
response = urllib2.urlopen('http://www.winkworth.co.uk/rent/property/terraced-house-to-rent-in-mill-road--/WOT140129')
response.read()

However I'm getting an empty string. When I try it through the browser or with cURL it works fine. Any ideas what's going on?

Solution

I got a response when using the requests library but not when using urllib2, so I experimented with HTTP request headers.

As it turns out, the server expects an Accept header. urllib2 doesn't send one by default, whereas requests and cURL both send Accept: */*.

Send one with urllib2 as well:

url = 'http://www.winkworth.co.uk/sale/property/flat-for-sale-in-masefield-court-london-n5/HIH140004'
req = urllib2.Request(url, headers={'accept': '*/*'})
response = urllib2.urlopen(req)
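
urllib2 exists only in Python 2. On Python 3 the same classes live in urllib.request, so the fix might look roughly like the sketch below. The helper names build_request and fetch are assumptions made for illustration and are not part of the original answer; the site may also behave differently today than it did when the answer was written.

```python
import urllib.request


def build_request(url):
    """Return a Request that explicitly advertises 'Accept: */*'."""
    # urllib.request does not add an Accept header by itself, so set it here.
    return urllib.request.Request(url, headers={"Accept": "*/*"})


def fetch(url):
    """Fetch url with the Accept header set and return the raw body bytes."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read()
```

Calling fetch(url) with the URL from the question would perform the actual download; build_request on its own makes no network call, which makes it easy to inspect the headers before sending anything.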

Demo:

>>> import urllib2
>>> url = 'http://www.winkworth.co.uk/sale/property/flat-for-sale-in-masefield-court-london-n5/HIH140004'
>>> len(urllib2.urlopen(url).read())
0
>>> request = urllib2.Request(url, headers={'accept': '*/*'})
>>> len(urllib2.urlopen(request).read())
37197
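
Since the real site may change, the behaviour can also be reproduced locally. The following is a minimal Python 3 sketch, assuming a hypothetical handler (PickyHandler) that mimics the server described above by returning an empty body whenever the Accept header is missing. The handler, the helper fetch_lengths, and the response body are all invented for the demonstration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical server: empty body unless the client sends Accept."""

    def do_GET(self):
        body = b"<html>listing</html>" if self.headers.get("Accept") else b""
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_lengths():
    """Return (body length without Accept, body length with Accept)."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        with opener.open(url) as resp:
            without_accept = len(resp.read())
        req = urllib.request.Request(url, headers={"Accept": "*/*"})
        with opener.open(req) as resp:
            with_accept = len(resp.read())
        return without_accept, with_accept
    finally:
        server.shutdown()
        server.server_close()
```

fetch_lengths() should return a zero length for the first request and a non-zero length for the second, mirroring the 0 versus 37197 seen in the demo against the real site.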

The server is at fault here; RFC 2616 (section 14.1) states:

If no Accept header field is present, then it is assumed that the client accepts all media types.
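
On the server side, following that rule could be as simple as substituting */* when the header is absent. This is only a sketch: the function name effective_accept is hypothetical, and it assumes the request headers are available as a plain dict, which will differ between web frameworks.

```python
def effective_accept(headers):
    """Return the Accept value to negotiate with; a missing header means '*/*'."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("accept", "*/*")
```

A server that negotiated content on effective_accept(headers) rather than on the raw header would have served the page to urllib2 without any client-side workaround.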
