R: distribute weights over a vector
Question
Suppose I have a vector in R:
0 1 0 0 1 0 0 0 0 1 0
There can be no more than six 1s anywhere in the vector. All other elements are 0.
I'm trying to generate every possible way of distributing a total weight of 1 across the positions that hold a 1, where each individual weight has to be <= 0.5.
So, for example:
0 .2 0 0 .3 0 0 0 0 .5 0 <- OK
0 .35 0 0 .4 0 0 0 0 .25 0 <- OK
but
0 .2 0 0 .2 0 0 0 0 .6 0 <- not OK
The weights go up in increments of 0.05.
Thus, in a vector with three 1s there are at most 20^3 combinations, many of which will be bad because they sum to something other than 1 or contain values > 0.5. Is there a faster way than brute force?
Edit:
I realized that I can come up with all possible weights quickly using:
temp <- expand.grid(replicate(sum(x),seq(0.05,.5,0.05), simplify=FALSE))
where x is my vector.
So now, for each row of temp, I want to place the weights at the positions where the 1s are in
0 1 0 0 1 0 0 0 0 1 0
Recommended answer
Edit: As @www points out in the comments, you will miss some combinations/permutations if you rely on floating point arithmetic. To remedy this, we need to work with integer precision (i.e. instead of seq(0, 0.5, 0.05) we need seq(0L, 50L, 5L)) and divide our results by 100.
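The floating point pitfall can be reproduced with base R alone. The following is a minimal sketch, not part of the original answer: it brute-forces three positions with expand.grid, once on the 0.05 grid and once on the integer grid, and counts the rows that hit the target sum exactly. The names grid_dbl, grid_int, n_dbl and n_int are illustrative.

```r
# Brute force over 3 positions, once with doubles and once with integers.
grid_dbl <- expand.grid(replicate(3, seq(0, 0.5, 0.05), simplify = FALSE))
grid_int <- expand.grid(replicate(3, seq(0L, 50L, 5L), simplify = FALSE))

# Exact comparison on doubles can miss rows whose sum equals 1 only up to
# rounding error.
n_dbl <- sum(rowSums(grid_dbl) == 1)

# Integer sums are exact, so every valid row is found.
n_int <- sum(rowSums(grid_int) == 100L)

c(double = n_dbl, integer = n_int)

# The underlying issue in its best-known form:
(0.1 + 0.2) == 0.3                 # exact comparison
isTRUE(all.equal(0.1 + 0.2, 0.3))  # comparison with a tolerance
```

If n_dbl comes out smaller than n_int, the difference is the number of valid rows lost to rounding. Comparing with a tolerance would also work, but integer arithmetic avoids the question entirely.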
I authored the package RcppAlgos, which is meant precisely for problems such as these:
library(RcppAlgos)
myCombs <- comboGeneral(seq(0L,50L,5L), 6, TRUE,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = 100L) / 100
head(myCombs, n = 10)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0.00 0.50 0.50
[2,] 0 0 0 0.05 0.45 0.50
[3,] 0 0 0 0.10 0.40 0.50
[4,] 0 0 0 0.10 0.45 0.45
[5,] 0 0 0 0.15 0.35 0.50
[6,] 0 0 0 0.15 0.40 0.45
[7,] 0 0 0 0.20 0.30 0.50
[8,] 0 0 0 0.20 0.35 0.45
[9,] 0 0 0 0.20 0.40 0.40
[10,] 0 0 0 0.25 0.25 0.50
tail(myCombs, n = 10)
[,1] [,2] [,3] [,4] [,5] [,6]
[190,] 0.10 0.10 0.15 0.15 0.15 0.35
[191,] 0.10 0.10 0.15 0.15 0.20 0.30
[192,] 0.10 0.10 0.15 0.15 0.25 0.25
[193,] 0.10 0.10 0.15 0.20 0.20 0.25
[194,] 0.10 0.10 0.20 0.20 0.20 0.20
[195,] 0.10 0.15 0.15 0.15 0.15 0.30
[196,] 0.10 0.15 0.15 0.15 0.20 0.25
[197,] 0.10 0.15 0.15 0.20 0.20 0.20
[198,] 0.15 0.15 0.15 0.15 0.15 0.25
[199,] 0.15 0.15 0.15 0.15 0.20 0.20
If you are interested in permutations, no problem:
myPerms <- permuteGeneral(seq(0L,50L,5L), 6, TRUE,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = 100L) / 100
head(myPerms, n = 10)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0.0 0.0 0.0 0.5 0.5
[2,] 0 0.0 0.0 0.5 0.0 0.5
[3,] 0 0.0 0.0 0.5 0.5 0.0
[4,] 0 0.0 0.5 0.0 0.0 0.5
[5,] 0 0.0 0.5 0.0 0.5 0.0
[6,] 0 0.0 0.5 0.5 0.0 0.0
[7,] 0 0.5 0.0 0.0 0.0 0.5
[8,] 0 0.5 0.0 0.0 0.5 0.0
[9,] 0 0.5 0.0 0.5 0.0 0.0
[10,] 0 0.5 0.5 0.0 0.0 0.0
tail(myPerms, n = 10)
[,1] [,2] [,3] [,4] [,5] [,6]
[41109,] 0.15 0.15 0.20 0.20 0.15 0.15
[41110,] 0.15 0.20 0.15 0.15 0.15 0.20
[41111,] 0.15 0.20 0.15 0.15 0.20 0.15
[41112,] 0.15 0.20 0.15 0.20 0.15 0.15
[41113,] 0.15 0.20 0.20 0.15 0.15 0.15
[41114,] 0.20 0.15 0.15 0.15 0.15 0.20
[41115,] 0.20 0.15 0.15 0.15 0.20 0.15
[41116,] 0.20 0.15 0.15 0.20 0.15 0.15
[41117,] 0.20 0.15 0.20 0.15 0.15 0.15
[41118,] 0.20 0.20 0.15 0.15 0.15 0.15
The results are instantaneous:
system.time(permuteGeneral(seq(0L,50L,5L), 6, TRUE,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = 100L) / 100)
user system elapsed
0.005 0.001 0.006
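The question also asks how to put each set of weights back at the positions of the 1s. A minimal base R sketch, assuming the weight matrix has exactly as many columns as x has 1s; the small matrix w below is hand-written for illustration and stands in for the rows of a result such as myPerms (which would require a vector with six 1s):

```r
x <- c(0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0)

# Two illustrative weight rows, one column per 1 in x.
w <- rbind(c(0.20, 0.30, 0.50),
           c(0.35, 0.40, 0.25))
stopifnot(ncol(w) == sum(x == 1))

# One output row per weight row: zeros everywhere except where x is 1.
out <- matrix(0, nrow = nrow(w), ncol = length(x))
out[, x == 1] <- w
out
```

Assigning to the selected columns in one step avoids looping over rows, which matters once the weight matrix has tens of thousands of rows.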
Quick Thoughts
One may be tempted to attack this problem as an additive integer partition problem. There is a mapping from seq(0, 0.5, 0.05) to 0:10 as well as a mapping from seq(0, 1, 0.05) to 0:20. It may not be obvious why the latter is helpful, but it is: weights on the 0.05 grid that sum to 1 correspond to integers that sum to 20. There is a very nice package called partitions that comes equipped with a function for generating restricted partitions (that is, partitions of a given length).
library(partitions)
myParts <- t(as.matrix(restrictedparts(20, 6))) / 20
head(myParts)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1.00 0.00 0 0 0 0
[2,] 0.95 0.05 0 0 0 0
[3,] 0.90 0.10 0 0 0 0
[4,] 0.85 0.15 0 0 0 0
[5,] 0.80 0.20 0 0 0 0
[6,] 0.75 0.25 0 0 0 0
As you can see, we have already violated our requirement that no number be greater than 0.5. So we have to do a little extra work to get our final result:
myMax <- apply(myParts, 1, max)
# keep rows whose largest weight is <= 0.5 (logical indexing also behaves when no row is dropped)
myFinalParts <- myParts[myMax <= 0.5, ]
head(myFinalParts)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.50 0.50 0.00 0 0 0
[2,] 0.50 0.45 0.05 0 0 0
[3,] 0.50 0.40 0.10 0 0 0
[4,] 0.45 0.45 0.10 0 0 0
[5,] 0.50 0.35 0.15 0 0 0
[6,] 0.45 0.40 0.15 0 0 0
tail(myFinalParts, n = 10)
[,1] [,2] [,3] [,4] [,5] [,6]
[190,] 0.35 0.15 0.15 0.15 0.10 0.10
[191,] 0.30 0.20 0.15 0.15 0.10 0.10
[192,] 0.25 0.25 0.15 0.15 0.10 0.10
[193,] 0.25 0.20 0.20 0.15 0.10 0.10
[194,] 0.20 0.20 0.20 0.20 0.10 0.10
[195,] 0.30 0.15 0.15 0.15 0.15 0.10
[196,] 0.25 0.20 0.15 0.15 0.15 0.10
[197,] 0.20 0.20 0.20 0.15 0.15 0.10
[198,] 0.25 0.15 0.15 0.15 0.15 0.15
[199,] 0.20 0.20 0.15 0.15 0.15 0.15
As you can see, this is exactly the same solution as above (see myCombs), only with the columns in a different order.
all.equal(myCombs, myFinalParts[,6:1])
[1] TRUE
For the permutation part, these are actually referred to as restricted integer compositions. We can call partitions::compositions and proceed as above, weeding out the rows that break our rule (i.e. rows that contain a maximum value greater than 0.5). So it is possible to obtain the desired results with partitions; there are just a few extra steps involved.
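# all compositions of 20 into 6 parts (zeros allowed), rescaled to weights
myComps <- t(as.matrix(compositions(20, 6))) / 20
myMax <- apply(myComps, 1, max)
# drop rows whose largest weight exceeds 0.5
temp <- myComps[myMax <= 0.5, ]
# sort rows lexicographically so they can be compared with myPerms
myFinalComps <- temp[do.call(order, as.data.frame(temp)), ]
all.equal(myPerms[do.call(order, as.data.frame(myPerms)), ], myFinalComps)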
[1] TRUE
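As a sanity check on the row counts shown above (199 combinations and 41118 permutations), the numbers can be recomputed by plain counting in base R, without generating any rows. This is only a sketch; count_partitions and count_compositions are hypothetical helper names, not functions from RcppAlgos or partitions. Both work in units of 0.05, so the target is 20 and the cap is 10.

```r
# Number of ways to write `total` as a sum of at most `k` positive parts,
# each <= max_part, ignoring order (parts chosen in non-increasing order).
count_partitions <- function(total, k, max_part) {
  if (total == 0) return(1)
  if (k == 0 || max_part == 0) return(0)
  largest <- seq_len(min(max_part, total))
  sum(vapply(largest,
             function(j) count_partitions(total - j, k - 1, j),
             numeric(1)))
}

# Number of ordered k-tuples with entries in 0..max_part that sum to `total`.
count_compositions <- function(total, k, max_part) {
  ways <- c(1, numeric(total))  # ways[s + 1] = tuples so far summing to s
  for (pos in seq_len(k)) {
    new_ways <- numeric(total + 1)
    for (s in 0:total) {
      j <- 0:min(max_part, s)   # possible values for the current position
      new_ways[s + 1] <- sum(ways[s - j + 1])
    }
    ways <- new_ways
  }
  ways[total + 1]
}

count_partitions(20, 6, 10)    # compare with nrow(myCombs)
count_compositions(20, 6, 10)  # compare with nrow(myPerms)
```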