Need help to parallelize a loop in python


Problem description

I have a huge data set and I have to compute, for every point of it, a series of properties. My code is really slow, and I would like to make it faster by parallelizing the loop somehow. I would like each processor to compute the "series of properties" for a limited subsample of my data and then join all the properties together in one array. I'll try to explain what I have to do with an example.

Let's say that my data set is the array x:

from numpy import arange, linspace, sqrt

x = linspace(0, 20, 10000)

The "property" I want to get is, for instance, the square root of x:

prop = []
for i in arange(0, len(x)):
    # compute the property point by point (this is the slow part)
    prop.append(sqrt(x[i]))

The question is: how can I parallelize the above loop? Let's assume I have 4 processors and I would like each of them to compute the sqrt of 10000/4 = 2500 points.
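As an illustration of the intended split (just the chunking, not the parallel execution, which is the open question), a minimal sketch using numpy.array_split:

import numpy

x = numpy.linspace(0, 20, 10000)
# one chunk per processor; each chunk holds 10000/4 = 2500 points
chunks = numpy.array_split(x, 4)
print([len(c) for c in chunks])  # [2500, 2500, 2500, 2500]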

I tried looking at some python modules like multiprocessing and mpi4py, but from the guides I couldn't find the answer to such a simple question.

EDIT

Thank you all for the precious comments and links you provided. However, I would like to clarify my question. I'm not interested in the sqrt function as such: I am doing a series of operations within a loop. I know perfectly well that loops are bad and that vectorized operations are always preferable to them, but in this case I really have to do a loop. I won't go into the details of my problem, because that would add unnecessary complication to this question. I would like to split my loop so that each processor does a part of it, meaning that I could run my code 40 times with 1/40 of the loop each and then merge the results, but that would be stupid. This is a brief example:

for i in arange(0, len(x)):
    # do some complicated stuff

What I want is to use 40 CPUs to do this:

for ncpu in arange(0, 40):
    # each worker handles one contiguous slice of x
    for i in arange(len(x) // 40 * ncpu, len(x) // 40 * (ncpu + 1)):
        # do some complicated stuff

Is that possible or not with python?
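For reference, here is one way the chunked loop above could be written with the standard library: a minimal sketch, assuming the "complicated stuff" can be wrapped in a picklable, module-level function (process_chunk below is a hypothetical placeholder, with sqrt standing in for the real work):

import numpy
import multiprocessing

def process_chunk(chunk):
    # placeholder for the "complicated stuff": one result per point
    return [numpy.sqrt(v) for v in chunk]

if __name__ == '__main__':
    x = numpy.linspace(0, 20, 10000)
    chunks = numpy.array_split(x, 40)           # one chunk per CPU
    pool = multiprocessing.Pool(processes=40)
    results = pool.map(process_chunk, chunks)   # each worker gets a chunk
    prop = numpy.concatenate(results)           # merge back into one array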

Recommended answer

I'm not sure that this is the way you should do things, as I'd expect numpy to have a much more efficient method of going about it, but do you just mean something like this?

import numpy
import multiprocessing

x = numpy.linspace(0, 20, 10000)
p = multiprocessing.Pool(processes=4)

print(p.map(numpy.sqrt, x))

Here are the results of timeit on both solutions. As @SvenMarcach points out, however, with a more expensive function multiprocessing will start to be much more effective.

% python -m timeit -s 'import numpy; x=numpy.linspace(0,20,10000)' 'prop=[]
for i in numpy.arange(0,len(x)):
         prop.append(numpy.sqrt(x[i]))'
10 loops, best of 3: 31.3 msec per loop

% python -m timeit -s 'import numpy, multiprocessing; x=numpy.linspace(0,20,10000)
p = multiprocessing.Pool(processes=4)' 'l = p.map(numpy.sqrt, x)' 
10 loops, best of 3: 102 msec per loop
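To make Sven's point concrete, here is a hedged sketch with a deliberately expensive per-element function; slow_op and the timing harness are illustrative assumptions, not code from the original post. With enough work per call, the interprocess overhead of Pool.map is amortized and the parallel version should pull ahead:

import math
import multiprocessing
import time

def slow_op(v):
    # artificially expensive per-element work
    s = 0.0
    for _ in range(10000):
        s += math.sqrt(v + 1.0)
    return s

if __name__ == '__main__':
    data = [float(i) for i in range(10000)]

    start = time.time()
    serial = [slow_op(v) for v in data]
    print('serial:   %.2f s' % (time.time() - start))

    p = multiprocessing.Pool(processes=4)
    start = time.time()
    parallel = p.map(slow_op, data, chunksize=250)  # batch points per task
    print('parallel: %.2f s' % (time.time() - start))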

At Sven's request, here is the result of l = numpy.sqrt(x), which is significantly faster than either of the alternatives.

% python -m timeit -s 'import numpy; x=numpy.linspace(0,20,10000)' 'l = numpy.sqrt(x)'
10000 loops, best of 3: 70.3 usec per loop

