Multithreaded file download in Python with download progress updates in the shell


Question

In an attempt to learn multithreaded file downloading, I wrote this script:

import urllib2
import os
import sys
import time
import threading

urls = ["http://broadcast.lds.org/churchmusic/MP3/1/2/nowords/271.mp3",
"http://s1.fans.ge/mp3/201109/08/John_Legend_So_High_Remix(fans_ge).mp3",
"http://megaboon.com/common/preview/track/786203.mp3"]

url = urls[1]

def downloadFile(url, saveTo=None):
    file_name = url.split('/')[-1]
    if not saveTo:
        saveTo = '/Users/userName/Desktop'
    try:
        u = urllib2.urlopen(url)
    except urllib2.URLError , er:
        print("%s" % er.reason)
    else:

        f = open(os.path.join(saveTo, file_name), 'wb')
        meta = u.info()
        file_size = int(meta.getheaders("Content-Length")[0])
        print "Downloading: %s Bytes: %s" % (file_name, file_size)
        file_size_dl = 0
        block_sz = 8192
        while True:
            buffer = u.read(block_sz)
            if not buffer:
                break

            file_size_dl += len(buffer)
            f.write(buffer)
            status = r"%10d  [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
            status = status + chr(8)*(len(status)+1)
            sys.stdout.write('%s\r' % status)
            time.sleep(.2)
            sys.stdout.flush()
            if file_size_dl == file_size:
                print r"Download Completed %s%% for file %s, saved to %s" % (file_size_dl * 100. / file_size, file_name, saveTo,)
        f.close()
        return


def synchronusDownload():
    urls_saveTo = {urls[0]: None, urls[1]: None, urls[2]: None}
    for url, saveTo in urls_saveTo.iteritems():
        th = threading.Thread(target=downloadFile, args=(url, saveTo), name="%s_Download_Thread" % os.path.basename(url))
        th.start()

synchronusDownload()

However, the second download seems to wait for the first thread to finish before it starts, and then the next file is downloaded after that. The shell output shows the same sequential behavior.

My plan was to start all downloads simultaneously and print the updated progress of each file as it downloads.

Any help would be greatly appreciated. Thanks.

Recommended answer

Your functions are actually running in parallel. You can verify this by printing at the start of each function call: three lines will be printed as soon as your program starts.
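
The check described above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the asker's code: `fake_download` is a hypothetical stand-in for `downloadFile` that sleeps instead of reading from the network, and the `events` list is an assumed helper used only to record the order in which things happen.

```python
import threading
import time

events = []                      # ("start" | "done", name), in order of occurrence
events_lock = threading.Lock()   # protects the shared events list


def fake_download(name, seconds):
    # Hypothetical stand-in for downloadFile: sleeps instead of downloading.
    with events_lock:
        events.append(("start", name))
    print("started %s" % name)   # the "print at the start of each function" check
    time.sleep(seconds)          # simulated transfer time
    with events_lock:
        events.append(("done", name))


threads = [
    threading.Thread(target=fake_download, args=("file%d" % i, 0.3))
    for i in range(3)
]
for th in threads:
    th.start()
for th in threads:
    th.join()                    # wait for every thread before inspecting events
```

If the threads really ran one after another, each "start" entry would be followed by its own "done" entry. With concurrent threads, the "start" entries should all be recorded before any "done" entry.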

What's happening is that your first two files are so small that they finish downloading before the scheduler switches threads. Try putting larger files in your list:

urls = [
"http://www.wswd.net/testdownloadfiles/50MB.zip",
"http://www.wswd.net/testdownloadfiles/20MB.zip",
"http://www.wswd.net/testdownloadfiles/100MB.zip",
]

Program output:

Downloading: 100MB.zip Bytes: 104857600
Downloading: 20MB.zip Bytes: 20971520
Downloading: 50MB.zip Bytes: 52428800
Download Completed 100.0% for file 20MB.zip, saved to .
Download Completed 100.0% for file 50MB.zip, saved to .
Download Completed 100.0% for file 100MB.zip, saved to .
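
For the stated goal of showing progress for all files at once, one possible approach (an assumption on my part, not something from the original code) is to have the worker threads record their progress in a shared dictionary and let the main thread redraw a single status line. When every thread writes its own `\r` status line to the shell, the lines overwrite each other. The sketch below is Python 3 and uses in-memory `io.BytesIO` objects in place of `urlopen` responses and output files, so the names `copy_with_progress`, `render`, `run`, and the file sizes are all hypothetical.

```python
import io
import sys
import threading
import time

progress = {}                     # name -> (bytes_done, bytes_total)
progress_lock = threading.Lock()  # protects the shared progress dict


def copy_with_progress(name, src, dst, total, block_sz=8192):
    # Same read loop as downloadFile, but it records progress instead of printing.
    done = 0
    while True:
        block = src.read(block_sz)
        if not block:
            break
        dst.write(block)
        done += len(block)
        with progress_lock:
            progress[name] = (done, total)
        time.sleep(0.001)         # only slows the in-memory demo down


def render():
    # Build one status line covering every file seen so far.
    with progress_lock:
        snapshot = sorted(progress.items())
    return "  ".join(
        "%s %5.1f%%" % (name, done * 100.0 / total)
        for name, (done, total) in snapshot
    )


def run(jobs):
    threads = [threading.Thread(target=copy_with_progress, args=job) for job in jobs]
    for th in threads:
        th.start()
    while any(th.is_alive() for th in threads):
        sys.stdout.write("\r" + render())   # only the main thread touches stdout
        sys.stdout.flush()
        time.sleep(0.05)
    for th in threads:
        th.join()
    sys.stdout.write("\r" + render() + "\n")


# Hypothetical file names and sizes standing in for real downloads.
sizes = {"a.zip": 200000, "b.zip": 80000, "c.zip": 400000}
outputs = {name: io.BytesIO() for name in sizes}
jobs = [(name, io.BytesIO(b"x" * size), outputs[name], size)
        for name, size in sizes.items()]
run(jobs)
```

With real downloads, `src` would be the response object returned by `urlopen` and `dst` an open file, and the `time.sleep(0.001)` call could be dropped. Keeping all writes to the shell in one thread is the design choice that keeps the progress line readable.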
