Python 2.5 - multi-threaded for loop
I've got a piece of code:
for url in get_lines(file):
    visit(url, timeout=timeout)
It gets URLs from a file and visits each one (via urllib2) in a for loop.
Is it possible to do this in a few threads? For example, 10 visits running at the same time.
I've tried:
from threading import Thread

for url in get_lines(file):
    Thread(target=visit, args=(url,), kwargs={"timeout": timeout}).start()
But it does not seem to work: there is no visible effect, and the URLs are visited as before.
The simplified version of function visit:
def visit(url, proxy_addr=None, timeout=30):
    (...)
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    return response.read()
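In fact, `Thread(...).start()` does run the calls concurrently; what the attempt above lacks is a way to collect results and to wait for the threads to finish. A minimal sketch of that pattern follows, where `worker`, `results`, and the URL list are illustrative names, and `visit` is replaced by a network-free stand-in so the example runs without urllib2:

```python
import threading

def visit(url, timeout=30):
    # Stand-in for the real urllib2-based visit(); returns the URL length
    # so the sketch runs without any network access.
    return len(url)

results = {}                       # shared across threads
results_lock = threading.Lock()

def worker(url, timeout):
    body = visit(url, timeout=timeout)
    results_lock.acquire()         # protect the shared dict
    try:
        results[url] = body
    finally:
        results_lock.release()

urls = ["http://example.com/a", "http://example.com/b"]
threads = [threading.Thread(target=worker, args=(u, 30)) for u in urls]
for t in threads:
    t.start()                      # each visit now runs concurrently
for t in threads:
    t.join()                       # wait until every visit has finished
```

Joining the threads is what makes the concurrency observable: the program blocks until all fetches are done, and `results` then holds one entry per URL.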
To expand on senderle's answer, you can use the Pool class in multiprocessing to do this easily:
from multiprocessing import Pool
pool = Pool(processes=5)
pages = pool.map(visit, get_lines(file))
When the map function returns, "pages" will be a list of the contents of the URLs. You can adjust the number of processes to whatever is suitable for your system.
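One caveat: the multiprocessing module was only added in Python 2.6, so a stock 2.5 install needs a backport of it. Where multiprocessing is available, its multiprocessing.dummy submodule exposes the same Pool API backed by threads, which suits I/O-bound URL fetching and avoids having to pickle the worker function. A sketch using a network-free stand-in for visit (the URL list and return string are illustrative):

```python
from multiprocessing.dummy import Pool  # thread-backed Pool, same API as multiprocessing.Pool

def visit(url, timeout=30):
    # Stand-in for the urllib2-based visit(); returns a fake "page"
    # so the sketch runs without network access.
    return "contents of %s" % url

urls = ["http://example.com/a", "http://example.com/b"]
pool = Pool(processes=5)       # 5 worker threads
pages = pool.map(visit, urls)  # blocks until done; preserves input order
pool.close()
pool.join()
```

Because the workers are threads, this keeps the convenient `pool.map` interface while the GIL is released during network I/O, so the fetches still overlap.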