将集合划分为“ k”个。几乎相等的片段(Scala,但与语言无关) [英] Partition a collection into "k" close-to-equal pieces (Scala, but language agnostic)
问题描述
在此代码段之前定义:
-
数据集
可以是向量
或列表
-
numberOfSlices
是Int
,表示要切片数据集的时间
dataset
can be aVector
orList
numberOfSlices
is anInt
denoting how many "times" to slice dataset
我想将数据集分成 numberOfSlices
个切片,并尽可能均匀地分布。我猜我所说的分割是指使用集合论术语的分区(所有的交集应该为空,所有的并集应该是原始的),尽管这不一定是一个集合,而只是一个任意集合。
I want to split the dataset into numberOfSlices
slices, distributed as evenly as possible. By "split" I guess I mean "partition" (intersection of all should be empty, union of all should be the original) to use the set theory term, though this is not necessarily a set, just an arbitrary collection.
例如
dataset = List(1, 2, 3, 4, 5, 6, 7)
numberOfSlices = 3
slices == ListBuffer(Vector(1, 2), Vector(3, 4), Vector(5, 6, 7))
有没有比下面的方法更好的方法了? (我什至不确定这是否是最佳选择……)
也许这不是算法上可行的尝试,在这种情况下,任何已知的良好启发式方法都可以?
Is there a better way to do it than what I have below? (which I'm not even sure is optimal...) Or perhaps this is not an algorithmically feasible endeavor, in which case any known good heuristics?
val slices = new ListBuffer[Vector[Int]]
val stepSize = dataset.length / numberOfSlices
var currentStep = 0
var looper = 0
while (looper != numberOfSlices) {
if (looper != numberOfSlices - 1) {
slices += dataset.slice(currentStep, currentStep + stepSize)
currentStep += stepSize
} else {
slices += dataset.slice(currentStep, dataset.length)
}
looper += 1
}
推荐答案
如果 xs.grouped(xs.size / n )
不适用于您,确切定义您想要的内容很容易。商是小块的大小,其余部分是大块的数目:
If the behavior of xs.grouped(xs.size / n)
doesn't work for you, it's pretty easy to define exactly what you want. The quotient is the size of the smaller pieces, and the remainder is the number of the bigger pieces:
def cut[A](xs: Seq[A], n: Int) = {
val (quot, rem) = (xs.size / n, xs.size % n)
val (smaller, bigger) = xs.splitAt(xs.size - rem * (quot + 1))
smaller.grouped(quot) ++ bigger.grouped(quot + 1)
}
这篇关于将集合划分为“ k”个。几乎相等的片段(Scala,但与语言无关)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!