Implementing Pool on a for loop with a lot of inputs


Question

I have been trying to speed up my code with numba and multiprocessing, but I cannot quite get it to work because my function takes a lot of arguments.

I have already simplified it by moving work into other functions (see below).

Since each agent (a class instance) is independent of the others for these actions, I would like to replace the for loop with a Pool.

So I would end up with one large function, pooling(), that I would call with the list of agents:

from multiprocessing import Pool

p = Pool(4)
p.map(pooling, list(agents))

But where do I add all the other arguments that the pooling function needs?

As it stands:

def check_demographics(month, my_agents, families, firms, year, mortality_men, mortality_women, fertility, state_id):

    dummy = list(my_agents)
    d = str(state_id.iloc[0])

    # Place where I would like to replace the loop. Everything below would become a function.

    for agent in dummy:

        if agent.get_region_id()[:2] == d:

            # Birthday
            if month % 12 == agent.month - 1:
                agent.update_age()

            # Mortality probability 
            if agent.get_gender() == 'Male':
                prob = mortality_men[mortality_men['age'] == agent.get_age()][year].iloc[0]

            # When gender is Female
            else:
                # Extract specific agent data to calculate mortality 'Female'
                prob = mortality_women[mortality_women['age'] == agent.get_age()][year].iloc[0]

                # Give birth decision
                age = agent.get_age()
                if 14 < age < 50:
                    pregnant(agent, fertility, year, families, my_agents)

            # Mortality procedures 
            if fixed_seed.random() < prob:
                mortal(my_agents, my_graveyard, families, agent, firms)

This is the most time-consuming function in my program, and @jit is not helping much.

Thanks a bunch.

Recommended answer

Yes, that is a lot of parameters! Consider using a class.

Since Pool.map supports only one iterable argument, you need to group everything in one place. I suggest the "Facade" pattern: an intermediate class that stores all the required parameters and exposes a single method (I call it check) that takes no parameters other than self.

class Facade(object):
    def __init__(self, agent, d, families, fertility, firms, month, mortality_men, mortality_women, my_agents,
                 my_graveyard, year):
        self.agent = agent
        self.d = d
        self.families = families
        self.fertility = fertility
        self.firms = firms
        self.month = month
        self.mortality_men = mortality_men
        self.mortality_women = mortality_women
        self.my_agents = my_agents
        self.my_graveyard = my_graveyard
        self.year = year

    def check(self):
        (agent, d, families, fertility, firms,
         month, mortality_men, mortality_women,
         my_agents, my_graveyard, year) = (
            self.agent, self.d, self.families, self.fertility, self.firms,
            self.month, self.mortality_men, self.mortality_women,
            self.my_agents, self.my_graveyard, self.year)
        if agent.get_region_id()[:2] == d:

            # Birthday
            if month % 12 == agent.month - 1:
                agent.update_age()

            # Mortality probability
            if agent.get_gender() == 'Male':
                prob = mortality_men[mortality_men['age'] == agent.get_age()][year].iloc[0]

            # When gender is Female
            else:
                # Extract specific agent data to calculate mortality 'Female'
                prob = mortality_women[mortality_women['age'] == agent.get_age()][year].iloc[0]

                # Give birth decision
                age = agent.get_age()
                if 14 < age < 50:
                    pregnant(agent, fertility, year, families, my_agents)

            # Mortality procedures
            if fixed_seed.random() < prob:
                mortal(my_agents, my_graveyard, families, agent, firms)

Note: my refactoring is admittedly ugly, but I wanted to keep the variable names unchanged for clarity.

Then your loop can look something like this:

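from multiprocessing import Pool

def check_demographics(month, my_agents, families, firms,
                       year, mortality_men, mortality_women,
                       fertility, state_id, my_graveyard):
    d = str(state_id.iloc[0])
    # One Facade per agent, each carrying every argument check() needs.
    facades = [Facade(agent, d, families, fertility, firms,
                      month, mortality_men, mortality_women,
                      my_agents, my_graveyard, year)
               for agent in my_agents]
    # The with-block shuts the worker processes down once map() returns.
    with Pool(4) as pool:
        pool.map(Facade.check, facades)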
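A lighter alternative to a wrapper class, not part of the original answer, is to bind the shared arguments with functools.partial so that only the agent is left for Pool.map to supply. The sketch below is a minimal illustration under assumptions: check_agent, check_all, and the threshold parameter are hypothetical stand-ins, not names from the question.

```python
from functools import partial
from multiprocessing import Pool


def check_agent(agent_age, year, threshold):
    """Hypothetical stand-in for the per-agent work."""
    # Returns a plain tuple so the result can be pickled back to the parent.
    return (agent_age, year, agent_age >= threshold)


def check_all(ages, year, threshold, processes=4):
    # partial() freezes the arguments shared by every call, leaving one
    # free parameter (the agent) for Pool.map to fill from the iterable.
    worker = partial(check_agent, year=year, threshold=threshold)
    with Pool(processes) as pool:
        return pool.map(worker, ages)
```

Call check_all from under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module. The worker function has to live at module top level so that it can be pickled.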

You said that each agent is independent of the others but, after analysing the loop, I see that you need the complete list of agents (the my_agents parameter). The Facade class makes this obvious. So your agent list must not change, and the internal state of each agent must stay frozen, while the pool is running.
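One consequence of the point above is worth spelling out: Pool.map pickles each item and sends it to a worker process, so the worker operates on a copy, and changes made there (such as agent.update_age() or removing an agent from my_agents) do not reach the objects in the parent. A common way around this is to let workers only compute decisions and have the parent apply them. The sketch below assumes a hypothetical minimal Agent class and reuses the birthday rule from the question. decide, apply_decisions, and run are illustrative names.

```python
from multiprocessing import Pool


class Agent:
    """Hypothetical minimal agent; the real class holds more state."""

    def __init__(self, name, age, month):
        self.name = name
        self.age = age
        self.month = month  # birth month, 1..12


def decide(task):
    # Runs in a worker on a pickled copy of the agent: read-only by design.
    agent, month = task
    has_birthday = (month % 12 == agent.month - 1)  # rule from the question
    return agent.name, has_birthday


def apply_decisions(agents, decisions):
    # Runs in the parent, the only place where mutations actually persist.
    by_name = {a.name: a for a in agents}
    for name, has_birthday in decisions:
        if has_birthday:
            by_name[name].age += 1


def run(agents, month, processes=4):
    with Pool(processes) as pool:
        decisions = pool.map(decide, [(a, month) for a in agents])
    apply_decisions(agents, decisions)
```

As with any Pool code, run should be called from under an `if __name__ == "__main__":` guard. Sending each worker only the data it needs, rather than the whole my_agents list, also keeps the pickling cost down.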
