N-D version of itertools.combinations in numpy
Question
I would like to implement itertools.combinations for numpy. Based on this discussion, I have a function that works for 1D input:
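from itertools import combinations
from numpy import array, asarray, dtype, fromiter

def combs(a, r):
    """
    Return successive r-length combinations of elements in the array a.
    Should produce the same output as array(list(combinations(a, r))), but
    faster.
    """
    a = asarray(a)
    # Structured dtype with r unnamed fields, so fromiter can consume r-tuples.
    dt = dtype([('', a.dtype)]*r)
    b = fromiter(combinations(a, r), dt)
    # Reinterpret the records as plain elements and restore the (n_combs, r) shape.
    return b.view(a.dtype).reshape(-1, r)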
The output makes sense:
In [1]: list(combinations([1,2,3], 2))
Out[1]: [(1, 2), (1, 3), (2, 3)]

In [2]: array(list(combinations([1,2,3], 2)))
Out[2]:
array([[1, 2],
       [1, 3],
       [2, 3]])

In [3]: combs([1,2,3], 2)
Out[3]:
array([[1, 2],
       [1, 3],
       [2, 3]])
However, it would be best if I could expand it to N-D inputs, where additional dimensions simply allow you to speedily do multiple calls at once. So, conceptually, if combs([1, 2, 3], 2) produces [1, 2], [1, 3], [2, 3], and combs([4, 5, 6], 2) produces [4, 5], [4, 6], [5, 6], then combs((1,2,3) and (4,5,6), 2) should produce [1, 2], [1, 3], [2, 3] and [4, 5], [4, 6], [5, 6], where "and" just represents parallel rows or columns (whichever makes sense), and likewise for additional dimensions.
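To pin down the intended semantics, here is a slow reference sketch that simply loops over rows. The name combs_reference is hypothetical, and treating each row as one independent input (with combinations taken along the last axis) is an assumption, since the question leaves the choice of axis open.

```python
import numpy as np
from itertools import combinations

def combs_reference(a, r):
    """Slow reference: apply combinations() to each row of a 2-D array.

    Assumption: every row is an independent input and combinations are
    taken along the last axis.
    """
    a = np.asarray(a)
    # One (n_combinations, r) block per row, stacked along a new first axis.
    return np.array([list(combinations(row, r)) for row in a])

a = np.array([[1, 2, 3], [4, 5, 6]])
out = combs_reference(a, 2)
print(out.shape)  # (2, 3, 2): 2 rows, 3 combinations each, 2 elements per combination
print(out)
```

Any faster N-D version should give the same result as this loop on 2-D input.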
I'm not sure about:
- How to make the dimensions work in a logical way that's consistent with how other functions work (for example, some numpy functions have an axis= parameter with a default of axis 0, so probably axis 0 should be the one I am combining along, and all other axes just represent parallel calculations?)
- How to get the above code to work with N-D input (right now I get ValueError: setting an array element with a sequence.)
- Whether there is a better way to do dt = dtype([('', a.dtype)]*r)
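On the last point, one possible alternative to the structured dtype is to flatten the tuples with chain.from_iterable and read them as plain elements. This is only a sketch: the name combs_flat is hypothetical, and it assumes 1-D input and Python 3.8+ for math.comb.

```python
import numpy as np
from itertools import chain, combinations
from math import comb  # Python 3.8+

def combs_flat(a, r):
    """1-D combinations without a structured dtype (hypothetical helper)."""
    a = np.asarray(a)
    n = a.shape[0]
    # Flatten the r-tuples into one stream of scalars; passing count lets
    # fromiter allocate the output once instead of growing it.
    flat = np.fromiter(chain.from_iterable(combinations(a, r)),
                       dtype=a.dtype, count=comb(n, r) * r)
    return flat.reshape(-1, r)

print(combs_flat([1, 2, 3], 2))
```

This avoids the view() step, at the cost of computing the number of combinations up front.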
Recommended answer
You can use itertools.combinations() to create the index array, and then use NumPy's fancy indexing:
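import numpy as np
from itertools import combinations, chain
from scipy.special import comb

def comb_index(n, k):
    # Number of k-combinations of n items, as an exact Python int.
    count = comb(n, k, exact=True)
    # Flatten the index tuples into a single 1-D stream for fromiter.
    index = np.fromiter(chain.from_iterable(combinations(range(n), k)),
                        int, count=count*k)
    return index.reshape(-1, k)

data = np.array([[1,2,3,4,5],[10,11,12,13,14]])
idx = comb_index(5, 3)
# Index the last axis with the (10, 3) index array; the result has shape (2, 10, 3).
print(data[:, idx])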
Output:
[[[ 1  2  3]
  [ 1  2  4]
  [ 1  2  5]
  [ 1  3  4]
  [ 1  3  5]
  [ 1  4  5]
  [ 2  3  4]
  [ 2  3  5]
  [ 2  4  5]
  [ 3  4  5]]

 [[10 11 12]
  [10 11 13]
  [10 11 14]
  [10 12 13]
  [10 12 14]
  [10 13 14]
  [11 12 13]
  [11 12 14]
  [11 13 14]
  [12 13 14]]]
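The same index-array idea can be wrapped with an axis= parameter, as the question asks. The following is a minimal sketch rather than part of the original answer: the name combs_nd and the default axis=0 are assumptions, and it uses np.take so that the chosen axis is replaced by two axes of shape (n_combinations, r).

```python
import numpy as np
from itertools import combinations

def combs_nd(a, r, axis=0):
    """r-length combinations along `axis`; other axes are parallel inputs.

    Hypothetical helper: name and default axis are assumptions.
    """
    a = np.asarray(a)
    n = a.shape[axis]
    # Index array of shape (n_combinations, r), built once for all slices.
    idx = np.array(list(combinations(range(n), r)), dtype=np.intp).reshape(-1, r)
    # np.take with a 2-D index array replaces `axis` by idx.shape.
    return np.take(a, idx, axis=axis)

data = np.array([[1, 2, 3, 4, 5], [10, 11, 12, 13, 14]])
print(combs_nd(data, 3, axis=1).shape)    # (2, 10, 3)
print(combs_nd(data.T, 3, axis=0).shape)  # (10, 3, 2)
```

With axis=1 this reproduces data[:, idx] from the answer above; with 1-D input and the default axis it matches the original combs().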