Python: How to download a file using a range of bytes?

Question

I want to download a file in multi-threaded mode, and I have the following code:

#!/usr/bin/env python

import httplib


def main():
    url_opt = '/film/0d46e21795209bc18e9530133226cfc3/7f_Naruto.Uragannie.Hroniki.001.seriya.a1.20.06.13.mp4'

    headers = {}
    headers['Accept-Language'] = 'en-GB,en-US,en'
    headers['Accept-Encoding'] = 'gzip,deflate,sdch'
    # The charset and cache-control values were swapped in the original.
    headers['Accept-Charset'] = 'ISO-8859-1,utf-8,*'
    headers['Cache-Control'] = 'max-age=0'
    headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 5.1)'
    headers['Connection'] = 'keep-alive'
    headers['Accept'] = 'text/html,application/xhtml+xml,application/xml,*/*'
    headers['Range'] = ''

    conn = httplib.HTTPConnection('data09-cdn.datalock.ru:80')
    conn.request("GET", url_opt, '', headers)

    print "Request sent"

    resp = conn.getresponse()
    print resp.status
    print resp.reason
    print resp.getheaders()

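    # Read the body once; a second resp.read() would return an empty string.
    body = resp.read()

    # Open in binary mode so the video data is written unchanged.
    with open('cartoon.mp4', 'wb') as file_for_write:
        file_for_write.write(body)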

    conn.close()


if __name__ == "__main__":
    main()

Here is the output:

Request sent
200
OK
[('content-length', '62515220'), ('accept-ranges', 'bytes'), ('server', 'nginx/1.2.7'), ('last-modified', 'Thu, 20 Jun 2013 12:10:43 GMT'), ('connection', 'keep-alive'), ('date', 'Fri, 14 Feb 2014 07:53:30 GMT'), ('content-type', 'video/mp4')]

This code works perfectly; however, I do not understand from the documentation how to download a file using ranges. Look at the response headers the server returns:

 ('content-length', '62515220'), ('accept-ranges', 'bytes')

The server supports ranges in the 'bytes' unit, and the content size is 62515220 bytes.

However, this request downloads the whole file. What I want to do first is obtain information from the server without downloading the file: does it support HTTP range queries for this file, and what is the content size? And how can I create an HTTP request with a range (e.g. 0-25000)?

Recommended answer

Pass a Range header with bytes=start_offset-end_offset as the range specifier.

For example, the following code retrieves the first 300 bytes (0-299):

>>> import httplib
>>> conn = httplib.HTTPConnection('localhost')
>>> conn.request("GET", '/', headers={'Range': 'bytes=0-299'}) # <----
>>> resp = conn.getresponse()
>>> resp.status
206
>>> resp.status == httplib.PARTIAL_CONTENT
True
>>> resp.getheader('content-range')
'bytes 0-299/612'
>>> content = resp.read()
>>> len(content)
300

Note that both start_offset and end_offset are inclusive.
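
Because both offsets are inclusive, a multi-threaded download needs ranges that neither overlap nor leave gaps. The following is a minimal sketch of one way to compute them, assuming the total size is already known (for example from the content-length header). The helper names split_ranges and range_header are hypothetical and not part of httplib.

```python
def split_ranges(total_size, parts):
    """Split total_size bytes into at most `parts` inclusive (start, end) pairs."""
    if total_size <= 0 or parts <= 0:
        return []
    chunk = -(-total_size // parts)  # ceiling division: bytes per range
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + chunk, total_size) - 1  # inclusive end offset
        ranges.append((start, end))
        start = end + 1  # next range begins right after this one
    return ranges


def range_header(start, end):
    """Format an inclusive byte range as a Range header value."""
    return 'bytes=%d-%d' % (start, end)
```

Each (start, end) pair can then be handed to its own thread, which sends range_header(start, end) as the Range header and writes the result at offset start in the output file.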

UPDATE

If the server does not understand the Range header, it responds with status code 200 (httplib.OK) instead of 206 (httplib.PARTIAL_CONTENT) and sends the whole content. To make sure the server returned partial content, check the status code:

>>> resp.status == httplib.PARTIAL_CONTENT
True
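
For the first part of the question (learning the size and range support without downloading the body), one option is a HEAD request, which returns only the headers. The sketch below is written for Python 3, where httplib was renamed http.client. The helper names supports_byte_ranges, probe, and fetch_range are hypothetical, and it is an assumption that the server answers HEAD with the same content-length and accept-ranges headers it sends for GET.

```python
import http.client


def supports_byte_ranges(accept_ranges):
    """Return True if an Accept-Ranges header value advertises byte ranges."""
    if not accept_ranges:
        return False
    units = [unit.strip().lower() for unit in accept_ranges.split(',')]
    return 'bytes' in units


def probe(host, path):
    """Send HEAD and return (size or None, range support) without a body."""
    conn = http.client.HTTPConnection(host)
    try:
        conn.request('HEAD', path)
        resp = conn.getresponse()
        resp.read()  # a HEAD response has no body; this finishes the exchange
        length = resp.getheader('content-length')
        size = int(length) if length is not None else None
        return size, supports_byte_ranges(resp.getheader('accept-ranges'))
    finally:
        conn.close()


def fetch_range(host, path, start, end):
    """GET bytes start..end (inclusive); raise if the server ignores Range."""
    conn = http.client.HTTPConnection(host)
    try:
        headers = {'Range': 'bytes=%d-%d' % (start, end)}
        conn.request('GET', path, headers=headers)
        resp = conn.getresponse()
        if resp.status != http.client.PARTIAL_CONTENT:
            # A 200 here would mean the whole file is being sent instead.
            raise RuntimeError('expected 206, got %d' % resp.status)
        return resp.read()
    finally:
        conn.close()
```

With these helpers, probe('localhost', '/') would be called once up front, and fetch_range would be called per range only if the probe reports byte-range support.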
