Parallel Processing - Pool - Python


Question

I'm trying to learn how to use multiprocessing in Python. I read about multiprocessing, and I'm trying to do something like this:

I have the following class (partial code), which has a method to produce Voronoi diagrams:

class ImageData:

    def generate_voronoi_diagram(self, seeds):
        """
        Generate a Voronoi diagram with *seeds* seeds
        :param seeds: the number of seeds in the Voronoi diagram
        """
        nx = []
        ny = []
        gs = []
        for i in range(seeds):
            # Generate a cell position
            pos_x = random.randrange(self.width)
            pos_y = random.randrange(self.height)
            nx.append(pos_x)
            ny.append(pos_y)

            # Save the f(x,y) data
            x = Utils.translate(pos_x, 0, self.width, self.range_min, self.range_max)
            y = Utils.translate(pos_y, 0, self.height, self.range_min, self.range_max)
            z = Utils.function(x, y)

            gs.append(z)

        for y in range(self.height):
            for x in range(self.width):
                # Start from the largest possible distance (the image diagonal)
                d_min = math.hypot(self.width - 1, self.height - 1)
                j = -1
                for i in range(seeds):
                    # The distance from a cell to the (x, y) point being considered
                    d = math.hypot(nx[i] - x, ny[i] - y)
                    if d < d_min:
                        d_min = d
                        j = i
                self.data[x][y] = gs[j]

I have to generate a large number of these diagrams, which consumes a lot of time, so I thought this was a typical problem to parallelize. I was doing it in the "normal" way, like this:

if __name__ == "__main__":
    entries = []
    for n in range(images):
        entry = ImD.ImageData(width, height)
        entry.generate_voronoi_diagram(seeds)
        entry.generate_heat_map_image("ImagesOutput/Entries/Entry"+str(n))
        entries.append(entry)

To parallelize this, I tried the following:

if __name__ == "__main__":
    entries = []
    seeds = np.random.poisson(100)
    p = Pool()
    entry = ImD.ImageData(width, height)
    res = p.apply_async(entry.generate_voronoi_diagram,(seeds))
    entries.append(entry)
    entry.generate_heat_map_image("ImagesOutput/Entries/EntryX")

But besides the fact that it doesn't work (it doesn't even generate a single diagram), I don't know how to specify that this has to be done N times.

Any help would be very much appreciated. Thanks.

Answer

Python's multiprocessing doesn't share memory (unless you explicitly tell it to). That means you won't see the "side effects" of any function that runs in a worker process. Your generate_voronoi_diagram method works by adding data to an entry value, which is a side effect. To see the results, you need to pass them back as the return value of your function.
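A minimal sketch of this behavior (the `Counter` class and `bump` function here are hypothetical stand-ins, not part of the original code): a mutation made inside a worker process is invisible to the parent, unless the mutated object is returned from the worker function.

```python
import multiprocessing

class Counter:
    """Tiny stand-in for ImageData: an object holding mutable state."""
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1

def bump(counter):
    counter.increment()  # side effect happens in the worker's copy only
    return counter       # the mutated copy must be returned explicitly

if __name__ == "__main__":
    c = Counter()
    with multiprocessing.Pool(1) as pool:
        result = pool.apply(bump, (c,))
    print(c.value)       # the parent's object is untouched: still 0
    print(result.value)  # the returned copy carries the change: 1
```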

Here's one approach that handles the entry instance as an argument and a return value:

def do_voroni(entry, seeds):
    entry.generate_voronoi_diagram(seeds)
    return entry

Now you can use this function in your worker processes:

if __name__ == "__main__":
    entries = [ImD.ImageData(width, height) for _ in range(images)]
    seeds = numpy.random.poisson(100, images)  # an array of seed counts, one per image

    pool = multiprocessing.Pool()
    for i, e in enumerate(pool.starmap(do_voroni, zip(entries, seeds))):
        e.generate_heat_map_image("ImagesOutput/Entries/Entry{:02d}".format(i))

The e values in the loop are not references to the objects in the entries list. Rather, they're copies of those objects, which were passed out to a worker process (which added data to them) and then passed back.
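Under the hood, the Pool serializes arguments and return values with pickle, so each worker operates on a round-tripped copy. A small sketch makes the copy semantics visible (`ImageDataStub` is a hypothetical stand-in for `ImD.ImageData`, not the real class):

```python
import pickle

class ImageDataStub:
    """Hypothetical stand-in for ImD.ImageData."""
    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.data = []

original = ImageDataStub(4, 4)

# The same round-trip the Pool performs when shipping an object
# to a worker process and receiving the result back.
clone = pickle.loads(pickle.dumps(original))
clone.data.append("filled in by a worker")

print(clone is original)  # False: the result is a distinct copy
print(original.data)      # []: the parent's object is untouched
```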
