Ways of speeding up itertools.product in Python


Question

I'm trying to create a NumPy array of all possible asset allocations using itertools.product. Each asset's allocation can range from 0 to 100% and can rise in increments of (100% / number of assets). The allocations must sum to 100%.

The calculation takes a very long time as the number of assets grows (10 seconds for 7 assets, 210 seconds for 8 assets, and so on). Is there a way to speed up the code? Maybe I should try it.takewhile or multiprocessing?

import itertools as it
import numpy as np

def CreateMatrix(Increments):

    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0;
    for i in inputs:
        if np.sum(i, axis = 0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1

    return matrix

Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);
print matrix

Recommended answer

Use the stdlib sum instead of numpy.sum. This code spends most of its time computing that sum, according to cProfile.
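As a rough illustration of why the swap helps, the sketch below times both functions on a single short tuple, which is the shape of input the loop passes on every iteration. The tuple `row`, the helper `time_call`, and the iteration count are assumptions made for this example, not part of the original code, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

# Hypothetical sample input: one 8-element tuple of floats, similar to a
# single item yielded by itertools.product in the question's loop.
row = (0.125,) * 8


def time_call(func, number=20000):
    """Return the total seconds taken to call func(row) `number` times."""
    return timeit.timeit(lambda: func(row), number=number)


builtin_seconds = time_call(sum)
numpy_seconds = time_call(np.sum)

print("builtin sum: %.4f s" % builtin_seconds)
print("numpy.sum:   %.4f s" % numpy_seconds)
```

On inputs this small, numpy.sum typically pays a fixed per-call overhead (converting the tuple to an array and dispatching the reduction), which is consistent with the `_wrapreduction` and `reduce` entries in the profile that follows.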

Profiling code

import cProfile, pstats, StringIO
import itertools as it
import numpy as np


def CreateMatrix(Increments):
    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0
    for i in inputs:
        if np.sum(i, axis=0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x += 1
    return matrix

pr = cProfile.Profile()
pr.enable()
Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);
print matrix
pr.disable()
s = StringIO.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print s.getvalue()

Truncated output

         301565912 function calls (301565864 primitive calls) in 294.255 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   26.294   26.294  294.254  294.254 product.py:7(CreateMatrix)
 43046721   41.948    0.000  267.762    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:1966(sum)
 43046723   60.071    0.000  217.863    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:69(_wrapreduction)
 43046723  124.341    0.000  124.341    0.000 {method 'reduce' of 'numpy.ufunc' objects}
 43046723   14.630    0.000   14.630    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:70(<dictcomp>)
 43046721   12.629    0.000   12.629    0.000 {getattr}
 43098200    7.958    0.000    7.958    0.000 {isinstance}
 43046724    6.191    0.000    6.191    0.000 {method 'items' of 'dict' objects}
     6434    0.047    0.000    0.199    0.000 Library/Python/2.7/lib/python/site-packages/numpy/lib/index_tricks.py:316(__getitem__)

Timing experiments

numpy.sum

import itertools as it
import numpy as np

def CreateMatrix(Increments):

    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0;
    for i in inputs:
        if np.sum(i, axis = 0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1

    return matrix

Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);

$ python -m timeit --number=3 --verbose "$(cat product.py)"
raw times: 738 696 697
3 loops, best of 3: 232 sec per loop

Stdlib sum

import itertools as it
import numpy as np

def CreateMatrix(Increments):

    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0;
    for i in inputs:
        if sum(i) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1

    return matrix

Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);

$ python -m timeit --number=3 --verbose "$(cat product.py)"
raw times: 90.5 84.3 85.3
3 loops, best of 3: 28.1 sec per loop

There are many more ways to get your solution faster, as other folks have said in their comments. Take a look at How do I "multi-process" the itertools product module? for an idea of how to use multiprocessing to speed this up. No matter what you do (a clever algorithm, concurrency, or both), replace the sum function; in the timings above it cut the per-loop time from 232 to 28.1 seconds for very little effort.
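One possible "clever algorithm", sketched here as an assumption rather than as the approach the answer or the comments had in mind, is to generate only the valid rows instead of filtering the full product. With 8 assets the product has 9**8 = 43046721 tuples (the call count in the profile), but only 6435 of them sum to 100%. The function name `allocations` and the stars-and-bars construction below are illustrative choices.

```python
import itertools as it

import numpy as np


def allocations(n_assets):
    """Return an array whose rows are all allocations summing to 1.

    Each asset receives a whole number of units out of n_assets units,
    so every entry is a multiple of 1 / n_assets.
    """
    total_units = n_assets
    # Stars and bars: placing n_assets - 1 bars among the slots splits
    # total_units stars into n_assets groups (groups may be empty).
    slots = total_units + n_assets - 1
    rows = []
    for bars in it.combinations(range(slots), n_assets - 1):
        edges = (-1,) + bars + (slots,)
        # The gap between consecutive bars is the unit count for one asset.
        rows.append([edges[k + 1] - edges[k] - 1 for k in range(n_assets)])
    # Convert integer unit counts to fractions once, at the end.
    return np.array(rows, dtype=float) / total_units


matrix = allocations(8)
print(matrix.shape)
```

Working in integer units until the final division also avoids comparing a floating-point sum against 1, which can be fragile when the increment is not exactly representable (for example 1/7). Building a list and converting it once replaces the repeated `np.r_` growth in the original loop.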
