Download file using partial download (HTTP)

This article discusses how to download a file over HTTP using partial downloads; the question and answer below may be a useful reference if you are solving the same problem.

Problem Description


Is there a way to download a huge and still-growing file over HTTP using the partial-download feature?

It seems that this code downloads the file from scratch every time it is executed:

import urllib
urllib.urlretrieve("http://www.example.com/huge-growing-file", "huge-growing-file")

I'd like:

  1. To fetch just the newly-written data
  2. To download from scratch only if the source file becomes smaller (for example, because it has been rotated).

Solution

It is possible to do a partial download using the Range header; the following will request a selected range of bytes:

import urllib2

req = urllib2.Request('http://www.python.org/')
req.headers['Range'] = 'bytes=%s-%s' % (start, end)  # e.g. 'bytes=100-150'
f = urllib2.urlopen(req)

For example:

>>> req = urllib2.Request('http://www.python.org/')
>>> req.headers['Range'] = 'bytes=%s-%s' % (100, 150)
>>> f = urllib2.urlopen(req)
>>> f.read()
'l1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.'
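
As a side note (not in the original answer): you can confirm that the server actually honored the range by inspecting the response. A range-aware server replies 206 Partial Content and includes a Content-Range header; a server that ignores the Range header replies 200 with the full body:

import urllib2

req = urllib2.Request('http://www.python.org/')
req.headers['Range'] = 'bytes=100-150'
f = urllib2.urlopen(req)

# 206 means the range was honored; 200 means the whole document came back.
print f.getcode()                           # e.g. 206
print f.info().getheader('Content-Range')   # e.g. 'bytes 100-150/51082'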

Using this header you can resume partial downloads. In your case, all you have to do is keep track of the size you have already downloaded and request a new range.

Keep in mind that the server needs to accept this header for this to work.
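
Putting this together, below is a minimal sketch of such a loop, assuming Python 2's urllib2 (to match the code above) and a server that honors Range requests; the URL, local filename, and the fetch_new_data helper are illustrative, not part of the original answer. It requests 'bytes=OFFSET-' to fetch only the new data, and treats a 416 reply whose advertised size is smaller than the local copy as a rotation, restarting from scratch:

import os
import urllib2

URL = "http://www.example.com/huge-growing-file"   # illustrative
LOCAL = "huge-growing-file"

def fetch_new_data(url, local_path):
    # How many bytes do we already have locally?
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0

    req = urllib2.Request(url)
    if offset:
        # 'bytes=N-' asks for everything from byte N to the current end.
        req.headers['Range'] = 'bytes=%d-' % offset

    try:
        f = urllib2.urlopen(req)
    except urllib2.HTTPError as e:
        if e.code != 416:
            raise
        # 416: our offset is past the end of the remote file. Either
        # nothing new has been written (offset == size) or the file
        # shrank, i.e. it was rotated. The 416 reply advertises the
        # current size as 'Content-Range: bytes */SIZE'.
        size = e.hdrs.get('Content-Range', 'bytes */%d' % offset).split('/')[-1]
        if size.isdigit() and int(size) < offset:
            os.remove(local_path)               # rotated: start over
            return fetch_new_data(url, local_path)
        return                                  # no new data yet

    # A server that ignores Range replies 200 with the whole file;
    # append only if we actually got 206 Partial Content.
    mode = 'ab' if offset and f.getcode() == 206 else 'wb'
    out = open(local_path, mode)
    try:
        while True:
            chunk = f.read(64 * 1024)
            if not chunk:
                break
            out.write(chunk)
    finally:
        out.close()
        f.close()

fetch_new_data(URL, LOCAL)

Run it periodically (for example from cron) and each invocation appends only what has been written since the last one.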
