(Nearly) Evenly select items from a list


Problem description

I have a list of N elements, and I'd like to sample M (<= N) values which are as evenly spaced as possible. To be more specific, let's say the selection should minimize the differences in the spacings between sampled points. For example, let's say I'm constructing a boolean indexing array (i.e. in Python) to select elements.

I tried this algorithm (from a similar but different question, "How do you split a list into evenly sized chunks?"):

q, r = divmod(N, M)
indices = [q*jj + min(jj, r) for jj in range(M)]

Sometimes this works well:

N=11 M=6
good_index = [0 1 0 1 0 1 0 1 0 1 0]

N=14 M=6
good_index = [0 1 1 0 1 1 0 1 0 1 0 1 0 1]

Here, the first example is trivial because the array can be evenly divided. The second example cannot be evenly divided, but the spacing between points is as similar as possible (2, 2, 1, 1, 1, 1).

But often it works poorly:

N=16 M=10
bad_index = [0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0]

N=14 M=10
bad_index = [0 1 0 1 0 1 0 1 0 0 0 0 0 0]

Because you have values piled up at the end.
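The pile-up is easy to reproduce by listing the gaps between consecutive selected indices. This is only a minimal sketch; the helper names `chunk_indices` and `gaps` are hypothetical and not part of the original question.

```python
def chunk_indices(N, M):
    """Indices produced by the divmod formula quoted in the question."""
    q, r = divmod(N, M)
    return [q * jj + min(jj, r) for jj in range(M)]


def gaps(indices):
    """Differences between consecutive selected indices."""
    return [b - a for a, b in zip(indices, indices[1:])]


for N, M in [(14, 6), (16, 10)]:
    idx = chunk_indices(N, M)
    # All the larger gaps come first, so the smaller gaps cluster at the end.
    print(N, M, idx, gaps(idx))
```

For N=16, M=10 the gaps come out as six 2s followed by three 1s: the larger gaps all sit at the front, so the closely spaced indices bunch together at the end.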

Edit 1: Whoops, just realized each list above is technically inverted (0's should be 1's and vice versa), but it should still convey the right idea.

Edit 2: the above algorithm tends to work better (judging by visual inspection of results for randomly chosen numbers) than something conceptually simpler like:

step = N // M  # integer step; same as int(floor(N/M)) without needing an import
last = M * step  # this prevents us from getting M+1 elements
indices = list(range(0, last, step))
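To see why the fixed-step approach does worse, it helps to run it on one of the cases above. The following is a small sketch, with the hypothetical wrapper name `step_indices` introduced only for illustration.

```python
def step_indices(N, M):
    """Fixed-step selection: every `step`-th index, truncated to M items."""
    step = N // M
    last = M * step  # stop early so that exactly M indices are produced
    return list(range(0, last, step))


# With N=14 and M=10 the step is 1, so only the first 10 indices are chosen
# and the last 4 positions are never sampled.
print(step_indices(14, 10))
```

Whenever M > N/2 the step rounds down to 1, so the selection is just the first M elements.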

Recommended answer

Looking at the results of a few tests (including the ones above), the problem arises when M > N/2, i.e. when more than half of the values are being sampled. It works well for M < N/2, so the solution I'm using for the moment is simply to invert the problem when M > N/2:

Note: this actually creates a mask of size N that is False at M positions, spaced as evenly as possible.

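import numpy as np

def even_select(N, M):
    """Return a length-N mask that is False (0) at M evenly spaced positions."""
    if M > N/2:
        # Inverted problem: start all False, then mark the N-M positions to keep True.
        cut = np.zeros(N, dtype=int)
        if M < N:  # guard: M == N would make divmod divide by zero
            q, r = divmod(N, N-M)
            indices = [q*i + min(i, r) for i in range(N-M)]
            cut[indices] = True
    else:
        # Direct problem: start all True, then mark the M sampled positions False.
        cut = np.ones(N, dtype=int)
        if M > 0:  # guard: M == 0 would make divmod divide by zero
            q, r = divmod(N, M)
            indices = [q*i + min(i, r) for i in range(M)]
            cut[indices] = False

    return cut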

I'd still be interested in more elegant solutions if they exist.
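One possible direction, offered only as a sketch and not as the method used in this answer: scale each sample number by N/M with integer arithmetic, so the rounding error is spread across the whole range instead of accumulating at one end. The function name `spread_indices` is hypothetical, and the sketch assumes 0 <= M <= N.

```python
def spread_indices(N, M):
    """Pick M indices from range(N), spreading the rounding error evenly.

    Assumes 0 <= M <= N. Each index is floor(i * N / M), so consecutive
    indices differ by either N // M or N // M + 1.
    """
    return [(i * N) // M for i in range(M)]


for N, M in [(16, 10), (14, 10)]:
    print(N, M, spread_indices(N, M))
```

For N=16, M=10 this gives indices 0, 1, 3, 4, 6, 8, 9, 11, 12, 14, so the gaps of 1 and 2 are interleaved rather than grouped, and no special case for M > N/2 is needed.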
