How to multiprocess for loops in python where each calculation is independent?

Question

I'm trying to learn something a little new in each mini-project I do. I've made a Game of Life (https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life) program.

This involves a numpy array where each point in the array (a "cell") has an integer value. To evolve the state of the game, you have to compute for each cell the sum of all its neighbour values (8 neighbours).
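
For example, a quick sketch of that computation for the centre cell of a small board (the 3x3 "blinker" values here are made up for illustration):

import numpy as np

# 1 = alive, 0 = dead; a vertical blinker
cells = np.array([[0, 1, 0],
                  [0, 1, 0],
                  [0, 1, 0]])

# neighbour sum of the centre cell: sum the 3x3 block around it,
# then subtract the cell itself
i, j = 1, 1
neighbour_sum = cells[i - 1:i + 2, j - 1:j + 2].sum() - cells[i, j]
print(neighbour_sum)  # 2 -> an alive cell with two neighbours survives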

The relevant class in my code is as follows, where evolve() takes in one of the xxx_method methods. It works fine for conv_method and loop_method, but I want to use multiprocessing (which I've identified should work, unlike multithreading?) on loop_method to see whether I get any performance increase. I feel it should work, as each calculation is independent. I've tried a naive approach, but I don't understand the multiprocessing module well enough. Could I also use it within the evolve() method, since again each calculation within the double for loop is independent?

Any help appreciated, including general code comments.

Edit - I'm getting a RuntimeError, which I'm half-expecting as my understanding of multiprocessing isn't good enough. What needs to be done to the code to get it to work?

import numpy as np
from multiprocessing import Process, cpu_count
from scipy.signal import correlate2d

class GoL:
    """ Game Engine """
    def __init__(self, size):
        self.size = size
        self.grid = Grid(size) # Grid is another class I've defined

    def evolve(self, neighbour_sum_func):
        new_grid = np.zeros_like(self.grid.cells) # start with everything dead, only need to test for keeping/turning alive
        neighbour_sum_array = neighbour_sum_func()
        for i in range(self.size):
            for j in range(self.size):
                cell_sum = neighbour_sum_array[i,j]
                if self.grid.cells[i,j]: # already alive
                    if cell_sum == 2 or cell_sum == 3:
                        new_grid[i,j] = 1
                else: # test for dead coming alive
                    if cell_sum == 3:
                        new_grid[i,j] = 1

        self.grid.cells = new_grid

    def conv_method(self):
        """ Uses 2D convolution across the entire grid to work out the neighbour sum at each cell """
        kernel = np.array([
                            [1,1,1],
                            [1,0,1],
                            [1,1,1]],
                            dtype=int)
        neighbour_sum_grid = correlate2d(self.grid.cells, kernel, mode='same')
        return neighbour_sum_grid

    def loop_method(self, partition=None):
        """ Also works out neighbour sum for each cell, using a more naive loop method """
        if partition is None:
            cells = self.grid.cells # no multiprocessing, just work on the entire grid
        else:
            cells = partition # just work on a set section of the grid

        neighbour_sum_grid = np.zeros_like(cells) # same shape as cells, all zeros
        for i, row in enumerate(cells):
            for j, cell_val in enumerate(row):
                # clamp the lower bound at 0 so the slice cannot wrap
                # around via negative indices on the first row/column
                neighbours = cells[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
                neighbour_sum = np.sum(neighbours) - cell_val
                neighbour_sum_grid[i,j] = neighbour_sum
        return neighbour_sum_grid

    def multi_loop_method(self):
        cores = cpu_count()
        procs = []
        slices = []
        if cores == 2: # for my VM; a generalised method for more cores is still needed
            half_grid_point = self.size // 2
            slices.append(self.grid.cells[0:half_grid_point])
            slices.append(self.grid.cells[half_grid_point:])
        else:
            raise NotImplementedError("only 2 cores handled so far")

        for sl in slices:
            # NB: Process cannot return loop_method's result to this process,
            # so the partial grids computed by the children are thrown away
            proc = Process(target=self.loop_method, args=(sl,))
            proc.start()
            procs.append(proc)

        for proc in procs:
            proc.join()
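
A hedged guess at the RuntimeError, since no traceback is shown: on platforms that spawn child processes (Windows, and macOS on recent Python versions), code that starts processes must sit behind an if __name__ == '__main__': guard, otherwise every child re-imports the module and immediately tries to spawn its own children, which raises RuntimeError. Separately, Process(target=...) discards loop_method's return value, so even a successful run would throw the partial results away. A multiprocessing.Pool addresses both points; a minimal sketch (the grid size and the standalone neighbour_sums helper are illustrative placeholders, not the code above):

import numpy as np
from multiprocessing import Pool, cpu_count

def neighbour_sums(cells):
    """Neighbour sums for one horizontal band of the grid."""
    sums = np.zeros_like(cells)
    for i in range(cells.shape[0]):
        for j in range(cells.shape[1]):
            block = cells[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            sums[i, j] = block.sum() - cells[i, j]
    return sums

if __name__ == '__main__':  # required: spawned children re-import this module
    grid = np.random.randint(0, 2, (100, 100))
    bands = np.array_split(grid, cpu_count())      # one horizontal band per core
    with Pool() as pool:
        results = pool.map(neighbour_sums, bands)  # unlike Process, results come back
    neighbour_sum_grid = np.vstack(results)

Note that this still undercounts neighbours along the band boundaries, because no band can see the rows owned by its neighbours; that is exactly the result-sharing problem the answer below describes.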

Answer

I want to use multiprocessing (which I've identified should work, unlike multithreading?)

Multithreading would not work because, in CPython, threads take turns on the interpreter (the GIL), so they would still run on a single processor, which is your current bottleneck. Multithreading is for workloads where you are waiting on something external, for example an API to answer; in the meantime you can do other calculations. But in Conway's Game of Life your program is computing constantly.

Getting multiprocessing right is hard. If you have 4 processors you can give each of them one quadrant of the grid, but the workers then need to share the results along their boundaries, and that communication costs you performance. They also need to be synchronized, advancing in lockstep one tick at a time, with the boundary results exchanged on every tick.
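
As a hedged sketch of one way to do that sharing (an assumption about the scheme, not something from the question or the paper below): give each worker its band of rows plus one overlapping "ghost" row on each side, then trim the overlap when stitching the partial results back together:

import numpy as np
from multiprocessing import Pool
from scipy.signal import correlate2d

KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=int)

def band_sums(band):
    # neighbour sums for one band, ghost rows included
    return correlate2d(band, KERNEL, mode='same')

def parallel_neighbour_sums(grid, workers=4):
    n = grid.shape[0]
    edges = np.linspace(0, n, workers + 1, dtype=int)
    # each band carries one extra row above and below, so cells on a
    # band boundary still see their real neighbours
    bands = [grid[max(a - 1, 0):min(b + 1, n)] for a, b in zip(edges, edges[1:])]
    with Pool(workers) as pool:
        sums = pool.map(band_sums, bands)
    # drop the ghost rows again before stitching the bands together
    trimmed = [s[(1 if a > 0 else 0):s.shape[0] - (1 if b < n else 0)]
               for s, (a, b) in zip(sums, zip(edges, edges[1:]))]
    return np.vstack(trimmed)

if __name__ == '__main__':
    grid = np.random.randint(0, 2, (200, 200))
    expected = correlate2d(grid, KERNEL, mode='same')
    assert np.array_equal(parallel_neighbour_sums(grid), expected)

correlate2d is used in the worker only to keep the sketch short; the same banding works with the loop method from the question. The synchronization mentioned above comes for free here, because pool.map does not return until every band of the current tick is finished.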

Multiprocessing starts being feasible when your grid is very big and there is a lot to calculate per tick. Since the question is very broad and complicated I cannot give you a better answer. There is a paper on parallel processing for Conway's Game of Life: http://www.shodor.org/media/content/petascale/materials/UPModules/GameOfLife/Life_Module_Document_pdf.pdf
