A very simple multithreading parallel URL fetching (without queue)


Problem description

I spent a whole day looking for the simplest possible multithreaded URL fetcher in Python, but most scripts I found use queues, multiprocessing, or complex libraries.

Finally I wrote one myself, which I am reporting as an answer. Please feel free to suggest any improvements.

I guess other people might have been looking for something similar.

Recommended answer

Simplifying your original version as far as possible:

import threading
import urllib2
import time

start = time.time()
urls = ["http://www.google.com", "http://www.apple.com", "http://www.microsoft.com", "http://www.amazon.com", "http://www.facebook.com"]

def fetch_url(url):
    # Fetch the page body and report how long after the shared start time it finished
    urlHandler = urllib2.urlopen(url)
    html = urlHandler.read()
    print "'%s' fetched in %ss" % (url, (time.time() - start))

# One thread per URL: start them all, then wait for them all to finish
threads = [threading.Thread(target=fetch_url, args=(url,)) for url in urls]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print "Elapsed Time: %s" % (time.time() - start)

The only new tricks here are:

  • Keep track of the threads you create.
  • Don't bother with a counter of threads if you just want to know when they're all done; join already tells you that.
  • If you don't need any state or external API, you don't need a Thread subclass, just a target function.
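
The script above targets Python 2 (urllib2 and print statements). As a minimal sketch, assuming Python 3.3 or later rather than anything from the original answer, the same pattern carries over with urllib.request in place of urllib2 and print() as a function; everything else is unchanged:

# Hypothetical Python 3 port of the script above; not part of the original answer
import threading
import urllib.request
import time

start = time.time()
urls = ["http://www.google.com", "http://www.apple.com", "http://www.microsoft.com", "http://www.amazon.com", "http://www.facebook.com"]

def fetch_url(url):
    # urlopen returns a response object; read() gives the body as bytes
    with urllib.request.urlopen(url) as response:
        html = response.read()
    print("%r fetched in %ss" % (url, time.time() - start))

# Same trick as above: one Thread per URL, start them all, then join them all
threads = [threading.Thread(target=fetch_url, args=(url,)) for url in urls]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print("Elapsed Time: %s" % (time.time() - start))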
