Split a list into n randomly sized chunks


Question

I am trying to split a list into n sublists, where the size of each sublist is random (at least one entry each; assume P > I, where P is the length of the list and I is the number of sublists). I used the numpy.split function, which works but does not satisfy my randomness condition. You may ask which distribution the sizes should follow; I don't think it matters. I checked several posts, but they are not equivalent to mine because they split into chunks of almost equal size. If this is a duplicate, let me know. Here is my approach:

import numpy as np

P = 10
I = 5
mylist = range(1, P + 1)
[list(x) for x in np.split(np.array(mylist), I)]

This approach fails when P is not divisible by I. Furthermore, it creates equally sized chunks, not randomly sized ones. Another constraint: I do not want to use the random package, but I am fine with numpy. Don't ask me why; I wish I had a logical answer for that.

Based on the answer provided by the mad scientist, this is the code I tried:

P = 10
I = 5

data = np.arange(P) + 1
indices = np.arange(1, P)
np.random.shuffle(indices)
indices = indices[:I - 1]
result = np.split(data, indices)
result

Output:

[array([1, 2]),
 array([3, 4, 5, 6]),
 array([], dtype=int32),
 array([4, 5, 6, 7, 8, 9]),
 array([10])]
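
The empty array and the repeated values in this output are what unsorted cut points produce: np.split slices between consecutive indices, so an index followed by a smaller one yields an empty slice, and the next slice reaches back over elements that were already used. A minimal sketch of this, assuming the shuffle happened to yield the cut points [2, 6, 3, 9] (an assumption chosen to be consistent with the output above; the actual shuffled values are not shown):

```python
import numpy as np

data = np.arange(10) + 1

# Assumed cut points, chosen to be consistent with the output above.
unsorted_cuts = np.array([2, 6, 3, 9])

# np.split slices data[0:2], data[2:6], data[6:3], data[3:9], data[9:].
broken = np.split(data, unsorted_cuts)
broken_sizes = [len(chunk) for chunk in broken]

# Sorting the same cut points gives non-empty, non-overlapping chunks.
fixed = np.split(data, np.sort(unsorted_cuts))
fixed_sizes = [len(chunk) for chunk in fixed]

print(broken_sizes)  # [2, 4, 0, 6, 1]
print(fixed_sizes)   # [2, 1, 3, 3, 1]
```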

Recommended answer

np.split is still the way to go. If you pass in a sequence of integers, split will treat them as cut points. Generating random cut points is easy. You can do something like

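import numpy as np

P = 10
I = 5

data = np.arange(P) + 1
# Draw cut points from 1..P-1; a cut at 0 would produce an empty first chunk.
indices = np.random.randint(1, P, size=I - 1)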

You want I - 1 cut points to get I chunks. The indices need to be sorted, and duplicates need to be removed; np.unique does both for you. You may end up with fewer than I chunks this way:

indices = np.unique(indices)  # sorts and removes duplicate cut points
result = np.split(data, indices)
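
To make the "fewer than I chunks" case concrete, here is a small sketch with a fixed, hypothetical draw in place of the random one, so the result is reproducible. The values in `drawn` are invented for illustration only:

```python
import numpy as np

data = np.arange(10) + 1

# Hypothetical draw of I - 1 = 4 cut points containing a duplicate (3 twice).
drawn = np.array([3, 7, 3, 1])

cuts = np.unique(drawn)        # sorted, duplicates removed: [1, 3, 7]
chunks = np.split(data, cuts)  # 3 cut points give 4 chunks, not 5

print([chunk.tolist() for chunk in chunks])
# [[1], [2, 3], [4, 5, 6, 7], [8, 9, 10]]
```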

If you absolutely need exactly I chunks, sample the cut points without replacement. One way to do that is with np.random.shuffle:

indices = np.arange(1, P)
np.random.shuffle(indices)
indices = indices[:I - 1]
indices.sort()
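
The same idea can be wrapped in a small helper. The sketch below is not part of the original answer: `random_chunks` is a hypothetical name, and it swaps the shuffle-and-slice step for `Generator.choice(..., replace=False)` from NumPy's newer random API, which also samples without replacement. It still uses only numpy, not the random package.

```python
import numpy as np

def random_chunks(data, n, rng=None):
    """Split data into exactly n non-empty chunks of random size (hypothetical helper)."""
    data = np.asarray(data)
    if not 1 <= n <= len(data):
        raise ValueError("n must be between 1 and len(data)")
    if rng is None:
        rng = np.random.default_rng()
    # Draw n - 1 distinct cut points from 1..len(data)-1, then sort them
    # so that np.split produces consecutive, non-empty slices.
    candidates = np.arange(1, len(data))
    cuts = np.sort(rng.choice(candidates, size=n - 1, replace=False))
    return np.split(data, cuts)

# Usage: a seeded generator makes the split reproducible between runs.
chunks = random_chunks(np.arange(10) + 1, 5, rng=np.random.default_rng(0))
print([chunk.tolist() for chunk in chunks])
```

Because the cut points are distinct and never equal to 0 or len(data), every chunk has at least one element and the chunks concatenate back to the original data.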
