Same output in different workers in multiprocessing

Question

I have a very simple case where the work to be done can be broken up and distributed among workers. I tried a very simple multiprocessing example from here:

import multiprocessing
import numpy as np
import time

def do_calculation(data):
    # Random delay in [0, 10) seconds, drawn from NumPy's global RNG
    rand = np.random.randint(10)
    print(data, rand)
    time.sleep(rand)
    return data * 2

if __name__ == '__main__':
    # Two worker processes per CPU
    pool_size = multiprocessing.cpu_count() * 2
    pool = multiprocessing.Pool(processes=pool_size)

    inputs = list(range(10))
    print('Input   :', inputs)

    pool_outputs = pool.map(do_calculation, inputs)
    print('Pool    :', pool_outputs)

The above program produces the following output:

Input   : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
0 7
1 7
2 7
5 7
3 7
4 7
6 7
7 7
8 6
9 6
Pool    : [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Why is the same random number printed by different workers? (My machine has 4 CPUs.) Is this the best/simplest way to go about it?

Recommended answer

I think you'll need to re-seed the random number generator by calling numpy.random.seed in your do_calculation function.

My guess is that the random number generator (RNG) gets seeded when you import the module. Then, when you use multiprocessing, you fork the current process with the RNG already seeded. Every worker process therefore starts from a copy of the same RNG state and generates the same sequence of numbers. That would match the output above: with a pool of 8 workers, each worker's first draw is 7, and the workers that pick up a second task draw 6.
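The inheritance effect can be imitated in a single process, without forking. The following is only a sketch under assumptions: a `RandomState` with an arbitrary seed (12345) stands in for the parent's global RNG, and `make_worker_rng` is a hypothetical helper that copies its state the way a forked child would inherit it.

```python
import numpy as np

# Stand-in for the parent's global RNG (the seed value is arbitrary).
parent = np.random.RandomState(12345)
inherited_state = parent.get_state()


def make_worker_rng(state):
    """Hypothetical helper: build an RNG that starts from a copied state."""
    rng = np.random.RandomState()
    rng.set_state(state)
    return rng


# Two "workers" that both start from the parent's state.
worker_a = make_worker_rng(inherited_state)
worker_b = make_worker_rng(inherited_state)

draws_a = [int(worker_a.randint(10)) for _ in range(5)]
draws_b = [int(worker_b.randint(10)) for _ in range(5)]

# Identical starting state gives an identical sequence.
print(draws_a == draws_b)  # True
```

Calling `seed()` with no argument on one of the two generators replaces its copied state with fresh entropy, which is what the re-seeding suggestion relies on.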

For example:

def do_calculation(data):
    # Re-seed from fresh entropy so this call does not reuse the inherited RNG state
    np.random.seed()
    rand = np.random.randint(10)
    print(data, rand)
    return data * 2
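As an alternative to re-seeding the global RNG inside every call, each task can be handed its own seed. This is not what the answer above proposes. It is a sketch that assumes NumPy 1.17 or later, where `np.random.SeedSequence` and `np.random.default_rng` are available. The helper `make_tasks`, the root seed 2024, and the fixed pool size of 4 are illustrative choices.

```python
import multiprocessing

import numpy as np


def do_calculation(task):
    """Double `data` and draw one random integer from a task-specific RNG."""
    data, seed_seq = task
    rng = np.random.default_rng(seed_seq)  # independent stream for this task
    rand = int(rng.integers(10))           # same range as np.random.randint(10)
    return data * 2, rand


def make_tasks(inputs, root_seed=None):
    """Hypothetical helper: pair each input with its own child SeedSequence."""
    children = np.random.SeedSequence(root_seed).spawn(len(inputs))
    return list(zip(inputs, children))


if __name__ == '__main__':
    inputs = list(range(10))
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(do_calculation, make_tasks(inputs, root_seed=2024))
    for doubled, rand in results:
        print(doubled, rand)
```

With a fixed `root_seed` the run is reproducible while each task still gets a distinct stream. Passing `root_seed=None` draws the root entropy from the operating system instead.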
