R: distribute weights over a vector


Question

Suppose I have a vector in R

 0    1    0    0    1    0    0    0    0     1     0

There can be no more than 6 "1"s anywhere in the vector. All other elements are 0.

I'm trying to get all possible ways to distribute a total weight of 1 across the positions that hold a 1, where each value has to be <= 0.5.

So, for example:

0    .2    0    0    .3    0    0    0    0     .5     0 <- OK

0    .35    0    0    .4    0    0    0    0     .25     0 <- OK

But:

0    .2   0    0    .2    0    0    0    0     .6     0  <- not ok

The values go up in increments of 0.05.

Thus, in a vector with 3 "1"s there are at most 20^3 combinations, many of which will be invalid because they sum to more than 1 or contain values > 0.5. Is there a faster way than brute-forcing this?

Edit:

I realized that I can come up with all possible weights quickly using:

temp <- expand.grid(replicate(sum(x),seq(0.05,.5,0.05), simplify=FALSE))

where x is my vector.

So now, for each row of temp, I want to put its values into the positions where the 1's are in

 0    1    0    0    1    0    0    0    0     1     0
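
A minimal base-R sketch of that placement step, assuming the vector x from the question. The names ones, grid, keep and out are illustrative and not part of the original post. It counts in integer units of 0.05 (so 1:10 stands for 0.05 to 0.50) to keep the "sums to 1" test exact, then writes each valid row into the columns where x is 1:

```r
x <- c(0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0)
ones <- which(x == 1)                     # positions that receive a weight

# All candidate weights per position, in integer units of 0.05
grid <- as.matrix(expand.grid(rep(list(1:10), length(ones))))

# Keep only rows whose weights add up to 20 units, i.e. to 1
keep <- grid[rowSums(grid) == 20L, , drop = FALSE]

# One output row per valid weighting; all other positions stay 0
out <- matrix(0, nrow = nrow(keep), ncol = length(x))
out[, ones] <- keep / 20

head(out)
```

This is still brute force over the grid, so it only scales to a handful of 1's; it is meant to show the indexing, not to replace a constrained generator.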


Recommended answer

Edit: As @www points out in the comments, you will miss some combinations/permutations if you rely on floating-point arithmetic. To remedy this, we need to work with integer precision (i.e. instead of seq(0, 0.5, 0.05) we need seq(0L, 50L, 5L)) and divide our results by 100.
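
A small illustration of why exact comparison on doubles is risky, using the textbook 0.1 + 0.2 case rather than anything specific to this problem. The variable names are illustrative only:

```r
# Exact equality on doubles can fail for sums that "should" match
float_exact <- (0.1 + 0.2 == 0.3)                  # FALSE on IEEE-754 doubles
float_tol   <- isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE, compares with a tolerance

# The same test in integer units of 0.01 is exact
w <- c(10L, 20L, 70L)
int_exact <- (sum(w) == 100L)                      # TRUE

# Convert to weights only at the very end
w / 100
```

Filtering on an integer sum and dividing afterwards avoids having to pick a tolerance at all.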

I authored the package RcppAlgos that is meant precisely for problems such as these:

library(RcppAlgos)
myCombs <- comboGeneral(seq(0L,50L,5L), 6, TRUE, 
                        constraintFun = "sum", 
                        comparisonFun = "==", 
                        limitConstraints = 100L) / 100
head(myCombs, n = 10)
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    0    0    0 0.00 0.50 0.50
 [2,]    0    0    0 0.05 0.45 0.50
 [3,]    0    0    0 0.10 0.40 0.50
 [4,]    0    0    0 0.10 0.45 0.45
 [5,]    0    0    0 0.15 0.35 0.50
 [6,]    0    0    0 0.15 0.40 0.45
 [7,]    0    0    0 0.20 0.30 0.50
 [8,]    0    0    0 0.20 0.35 0.45
 [9,]    0    0    0 0.20 0.40 0.40
[10,]    0    0    0 0.25 0.25 0.50

tail(myCombs, n = 10)
       [,1] [,2] [,3] [,4] [,5] [,6]
[190,] 0.10 0.10 0.15 0.15 0.15 0.35
[191,] 0.10 0.10 0.15 0.15 0.20 0.30
[192,] 0.10 0.10 0.15 0.15 0.25 0.25
[193,] 0.10 0.10 0.15 0.20 0.20 0.25
[194,] 0.10 0.10 0.20 0.20 0.20 0.20
[195,] 0.10 0.15 0.15 0.15 0.15 0.30
[196,] 0.10 0.15 0.15 0.15 0.20 0.25
[197,] 0.10 0.15 0.15 0.20 0.20 0.20
[198,] 0.15 0.15 0.15 0.15 0.15 0.25
[199,] 0.15 0.15 0.15 0.15 0.20 0.20

If you are interested in permutations, no problem:

myPerms <- permuteGeneral(seq(0L,50L,5L), 6, TRUE, 
                          constraintFun = "sum", 
                          comparisonFun = "==", 
                          limitConstraints = 100L) / 100

head(myPerms, n = 10)
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    0  0.0  0.0  0.0  0.5  0.5
 [2,]    0  0.0  0.0  0.5  0.0  0.5
 [3,]    0  0.0  0.0  0.5  0.5  0.0
 [4,]    0  0.0  0.5  0.0  0.0  0.5
 [5,]    0  0.0  0.5  0.0  0.5  0.0
 [6,]    0  0.0  0.5  0.5  0.0  0.0
 [7,]    0  0.5  0.0  0.0  0.0  0.5
 [8,]    0  0.5  0.0  0.0  0.5  0.0
 [9,]    0  0.5  0.0  0.5  0.0  0.0
[10,]    0  0.5  0.5  0.0  0.0  0.0

tail(myPerms, n = 10)
         [,1] [,2] [,3] [,4] [,5] [,6]
[41109,] 0.15 0.15 0.20 0.20 0.15 0.15
[41110,] 0.15 0.20 0.15 0.15 0.15 0.20
[41111,] 0.15 0.20 0.15 0.15 0.20 0.15
[41112,] 0.15 0.20 0.15 0.20 0.15 0.15
[41113,] 0.15 0.20 0.20 0.15 0.15 0.15
[41114,] 0.20 0.15 0.15 0.15 0.15 0.20
[41115,] 0.20 0.15 0.15 0.15 0.20 0.15
[41116,] 0.20 0.15 0.15 0.20 0.15 0.15
[41117,] 0.20 0.15 0.20 0.15 0.15 0.15
[41118,] 0.20 0.20 0.15 0.15 0.15 0.15

The results are instantaneous:

system.time(permuteGeneral(seq(0L,50L,5L), 6, TRUE, 
                           constraintFun = "sum", 
                           comparisonFun = "==", 
                           limitConstraints = 100L) / 100)
 user  system elapsed 
0.005   0.001   0.006



Quick Thoughts

One may be tempted to attack this problem as an additive integer partition problem. There is a mapping from seq(0, 0.5, 0.05) to 0:10 as well as a mapping from seq(0, 1, 0.05) to 0:20. It may not be obvious why the latter is helpful, but indeed it is. There is a very nice package called partitions that comes equipped with a function for generating restricted partitions (that is, partitions of a given length).
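
The mapping can be checked directly in base R. This is only a sanity check with illustrative variable names: seq(0, 0.5, 0.05) has 11 elements, so it lines up with the integers 0 through 10 counted in units of 1/20, and seq(0, 1, 0.05) lines up with 0 through 20:

```r
small_grid <- seq(0, 0.5, 0.05)
full_grid  <- seq(0, 1, 0.05)

n_small <- length(small_grid)   # number of allowed weights per position
n_full  <- length(full_grid)

# Compare with a tolerance, since the grids are doubles
map_small_ok <- isTRUE(all.equal(small_grid, (0:10) / 20))
map_full_ok  <- isTRUE(all.equal(full_grid, (0:20) / 20))
```

With this mapping, "weights that sum to 1" becomes "integers that sum to 20", which is the partition problem handled by restrictedparts(20, 6).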

library(partitions)
myParts <- t(as.matrix(restrictedparts(20, 6))) / 20

head(myParts)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1.00 0.00    0    0    0    0
[2,] 0.95 0.05    0    0    0    0
[3,] 0.90 0.10    0    0    0    0
[4,] 0.85 0.15    0    0    0    0
[5,] 0.80 0.20    0    0    0    0
[6,] 0.75 0.25    0    0    0    0

As you can see, we have already violated our requirement of having no numbers greater than 0.5. So we have to do a little extra work to get our final result:

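myMax <- apply(myParts, 1, max)
# Keep rows whose largest part is at most 0.5; logical indexing also
# behaves correctly when no row violates the limit
myFinalParts <- myParts[myMax <= 0.5, , drop = FALSE]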

head(myFinalParts)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.50 0.50 0.00    0    0    0
[2,] 0.50 0.45 0.05    0    0    0
[3,] 0.50 0.40 0.10    0    0    0
[4,] 0.45 0.45 0.10    0    0    0
[5,] 0.50 0.35 0.15    0    0    0
[6,] 0.45 0.40 0.15    0    0    0

tail(myFinalParts, n = 10)
       [,1] [,2] [,3] [,4] [,5] [,6]
[190,] 0.35 0.15 0.15 0.15 0.10 0.10
[191,] 0.30 0.20 0.15 0.15 0.10 0.10
[192,] 0.25 0.25 0.15 0.15 0.10 0.10
[193,] 0.25 0.20 0.20 0.15 0.10 0.10
[194,] 0.20 0.20 0.20 0.20 0.10 0.10
[195,] 0.30 0.15 0.15 0.15 0.15 0.10
[196,] 0.25 0.20 0.15 0.15 0.15 0.10
[197,] 0.20 0.20 0.20 0.15 0.15 0.10
[198,] 0.25 0.15 0.15 0.15 0.15 0.15
[199,] 0.20 0.20 0.15 0.15 0.15 0.15

As you can see, we have exactly the same solution as above (see myCombs); only the columns are in a different order.

all.equal(myCombs, myFinalParts[,6:1])
[1] TRUE

For the permutation part, these are actually referred to as restricted integer compositions. We can call partitions::compositions and proceed as above, weeding out the rows that break our rule (i.e. rows whose maximum value is greater than 0.5). So the desired results can be obtained with partitions; there are just a few extra steps involved.

myComps <- t(as.matrix(compositions(20, 6))) / 20
myMax <- apply(myComps, 1, max)
temp <- myComps[myMax <= 0.5, , drop = FALSE]  # logical index, safe if nothing is dropped
myFinalComps <- temp[do.call(order, as.data.frame(temp)), ]
all.equal(myPerms[do.call(order, as.data.frame(myPerms)), ], myFinalComps)
[1] TRUE
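
If neither package is available, the row counts can be cross-checked with a brute-force sketch in base R. This is not how RcppAlgos or partitions work internally; it simply enumerates all 11^6 grid points in integer units of 0.05 and filters them, so it uses noticeably more memory and time. The names units, k, total, grid, perms and combs are illustrative:

```r
units <- 0:10    # 0.00 .. 0.50 in steps of 0.05
k     <- 6       # number of positions
total <- 20L     # 20 units of 0.05 make 1

# Every candidate row, then only those that sum to the target
grid  <- as.matrix(expand.grid(rep(list(units), k)))
perms <- grid[rowSums(grid) == total, , drop = FALSE]

# Sorting each row and de-duplicating collapses orderings into combinations
combs <- unique(t(apply(perms, 1, sort)))

nrow(perms)   # should agree with nrow(myPerms) above
nrow(combs)   # should agree with nrow(myCombs) above

weights <- perms / 20   # convert to actual weights only at the end
```

The counts should come out to the 41118 permutations and 199 combinations shown in the outputs above.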
