Download multiple pages concurrently?

Question

I'd like to write a script in Python that can grab URLs from a database and download the web pages concurrently to speed things up, instead of waiting for each page to download one after another.

According to this thread, Python doesn't allow this because of something called the Global Interpreter Lock, which prevents launching the same script multiple times.

Before investing time in learning the Twisted framework, I'd like to make sure there isn't an easier way to do what I need to do above.

Thanks for any tips.

Recommended answer

Don't worry about the GIL. In your case it doesn't matter.

The easiest way to do what you want is to create a thread pool, using the threading module and one of the thread-pool implementations from ASPN. Each thread from that pool can use httplib to download your web pages. A sketch of this approach follows.
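
As an illustration only, here is a minimal sketch of the thread-pool approach, assuming Python 3: a hand-rolled pool built on threading and queue stands in for the ASPN recipe, and urllib.request stands in for httplib (http.client in Python 3). Names such as download_all, worker, and NUM_WORKERS are placeholders, not part of the original answer.

```python
# Minimal thread-pool sketch (assumes Python 3).
# urllib.request is used here instead of httplib for simplicity.
import queue
import threading
import urllib.request

NUM_WORKERS = 8  # number of concurrent downloads (illustrative value)

def worker(url_queue, results):
    """Pull URLs off the queue and download them until the queue is empty."""
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            return
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                results[url] = resp.read()
        except Exception as exc:          # keep going if one page fails
            results[url] = exc
        finally:
            url_queue.task_done()

def download_all(urls):
    """Download every URL concurrently; return {url: body or exception}."""
    url_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)
    results = {}
    threads = [threading.Thread(target=worker, args=(url_queue, results))
               for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    pages = download_all(["http://example.com/", "http://example.org/"])
    for url, body in pages.items():
        print(url, len(body) if isinstance(body, bytes) else body)
```

In practice you would replace the hard-coded URL list with the URLs fetched from your database; the pool size is the main knob for how many pages download at once.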

Another option is to use the PyCURL module -- it supports parallel downloads natively, so you don't have to implement it yourself.
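
A hedged sketch of the PyCURL route, assuming the pycurl package is installed, is shown below. It uses PyCURL's CurlMulti interface to drive several transfers from a single loop; the download_all helper and the buffer handling are illustrative, not prescribed by the answer.

```python
# Parallel downloads with PyCURL's CurlMulti interface (assumes pycurl is installed).
import io
import pycurl

def download_all(urls):
    """Fetch all URLs in parallel with one CurlMulti handle; return {url: bytes}."""
    multi = pycurl.CurlMulti()
    buffers = {}
    handles = []
    for url in urls:
        buf = io.BytesIO()
        c = pycurl.Curl()
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.WRITEDATA, buf)           # collect the response body
        c.setopt(pycurl.FOLLOWLOCATION, True)
        multi.add_handle(c)
        handles.append(c)
        buffers[url] = buf

    # Drive all transfers until none are still running.
    num_active = len(handles)
    while num_active:
        while True:
            ret, num_active = multi.perform()
            if ret != pycurl.E_CALL_MULTI_PERFORM:
                break
        multi.select(1.0)   # wait for socket activity instead of busy-looping

    for c in handles:
        multi.remove_handle(c)
        c.close()
    multi.close()
    return {url: buf.getvalue() for url, buf in buffers.items()}
```

The advantage here is that libcurl multiplexes the transfers for you, so no threads are needed; the trade-off is the extra dependency and a slightly lower-level API.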
