NumPy random choice, replacement only along one axis

Problem description

I need to sample a bunch of pairs of points from an array. I want each pair to consist of two DISTINCT points, but the points may be repeated among the various pairs.

e.g., if my array is X=np.array([1,1,2,3]), then:

>>> sample_pairs(X, n=4)
... [[1,1], [2,3], [1,2], [1,3]] # this is fine
>>> sample_pairs(X, n=4)
... [[1,1], [2,2], [3,3], [1,3]] # this is not okay

Is there a good way to accomplish this as a vectorized operation?

Solution

To sample a single pair without replacement, you can use np.random.choice:

np.random.choice(X, size=2, replace=False)
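
For several pairs, the call above can simply be repeated once per pair. The following is a minimal, non-vectorized sketch of that idea; the helper name `sample_pairs_loop` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def sample_pairs_loop(X, n):
    """Hypothetical helper: draw n pairs, one np.random.choice call per pair."""
    X = np.asarray(X)
    # replace=False guarantees the two entries of a pair come from
    # different positions of X; separate calls are independent, so the
    # same position may show up again in another pair.
    return np.array([np.random.choice(X, size=2, replace=False)
                     for _ in range(n)])

pairs = sample_pairs_loop(np.array([1, 1, 2, 3]), n=4)  # array of shape (4, 2)
```

This is only a baseline: it runs a Python-level loop, which is what the question is trying to avoid.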

Alternatively, to sample multiple pairs at a time, note that all possible pairs can be represented by the elements of range(len(X)*(len(X)-1)//2), and sample from that range using np.random.randint.

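import itertools
import numpy as np

combs = np.array(list(itertools.combinations(X, 2)))  # every pair of distinct positions
sample = np.random.randint(len(combs), size=10)  # pair numbers, repeats allowed
combs[sample]  # shape (10, 2): one sampled pair per row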
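
The same enumeration of pairs can be written with index arrays instead of itertools. This sketch uses `np.triu_indices` (my substitution, not part of the original answer) to list every position pair (i, j) with i < j; the example array `X` and the sample size of 10 are assumptions taken from the question and the snippet above.

```python
import numpy as np

X = np.array([1, 1, 2, 3])
n = len(X)

# Index arrays for all n*(n-1)//2 position pairs (i, j) with i < j.
i, j = np.triu_indices(n, k=1)

# Draw 10 pair numbers with replacement, so whole pairs may repeat.
k = np.random.randint(len(i), size=10)

# Each row holds the values found at two distinct positions of X.
pairs = np.column_stack((X[i[k]], X[j[k]]))
```

Like the itertools version, this still builds arrays whose length grows quadratically with len(X), so it only suits moderately sized inputs.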

Following up on @user2357112's comment: judging from the OP's own answer, they do not appear to care whether the sample size itself is deterministic. Noting also that sampling with the Mersenne Twister is slower than basic arithmetic operations, a different solution, for when X is so large that generating the combinations is not feasible, would be

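sample = np.random.randint(len(X)**2, size=N)  # each integer encodes one (i1, i2) index pair
i1 = sample // len(X)  # first index of each pair
i2 = sample % len(X)  # second index of each pair
X[np.vstack((i1, i2)).T[i1 != i2]]  # drop pairs with i1 == i2, then look up the values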

This produces a sample whose average size is N * (1 - 1/len(X)).
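
If exactly N pairs are needed, one possible variation (not from the original answer) replaces the rejection step with a modular offset: draw the first index freely, then add a random non-zero shift to get the second. The helper name `sample_pairs_offset` is hypothetical, and the sketch assumes X is one-dimensional with at least two elements.

```python
import numpy as np

def sample_pairs_offset(X, N):
    """Hypothetical helper: exactly N pairs, each from two distinct positions."""
    X = np.asarray(X)
    n = len(X)  # assumed to be >= 2, otherwise randint(1, n) raises
    i1 = np.random.randint(n, size=N)        # first position: 0 .. n-1
    shift = np.random.randint(1, n, size=N)  # non-zero offset: 1 .. n-1
    i2 = (i1 + shift) % n                    # second position, never equal to i1
    return X[np.column_stack((i1, i2))]      # shape (N, 2)

pairs = sample_pairs_offset(np.array([1, 1, 2, 3]), N=4)
```

Because the shift is never a multiple of n, the two positions always differ, so no rows are discarded and the output always has N rows. Equal values can still appear within a pair when X itself contains duplicates, as with [1, 1] in the question.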
