urllib2没有检索整个HTTP响应 [英] urllib2 not retrieving entire HTTP response

查看:140
本文介绍了urllib2没有检索整个HTTP响应的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我很困惑为什么我无法从 FriendFeed 下载某些JSON响应的全部内容使用 urllib2

I'm perplexed as to why I'm not able to download the entire contents of some JSON responses from FriendFeed using urllib2.

>>> import urllib2
>>> stream = urllib2.urlopen('http://friendfeed.com/api/room/the-life-scientists/profile?format=json')
>>> stream.headers['content-length']
'168928'
>>> data = stream.read()
>>> len(data)
61058
>>> # We can see here that I did not retrieve the full JSON
... # given that the stream doesn't end with a closing }
... 
>>> data[-40:]
'ce2-003048343a40","name":"Vincent Racani'

如何使用urllib2检索完整响应?

How can I retrieve the full response with urllib2?

推荐答案

获取所有数据的最佳方式:

Best way to get all of the data:

fp = urllib2.urlopen("http://www.example.com/index.cfm")

response = ""
while 1:
    data = fp.read()
    if not data:         # This might need to be    if data == "":   -- can't remember
        break
    response += data

print response

原因是,鉴于套接字的性质, .read()不能保证返回整个响应。我认为这在文档中讨论过(可能 urllib ),但我找不到它。

The reason is that .read() isn't guaranteed to return the entire response, given the nature of sockets. I thought this was discussed in the documentation (maybe urllib) but I cannot find it.

这篇关于urllib2没有检索整个HTTP响应的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆