计算R中集合的幂集(所有可能的子集)的算法 [英] Algorithm to calculate power set (all possible subsets) of a set in R

查看:16
本文介绍了计算R中集合的幂集(所有可能的子集)的算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在任何地方都找不到答案,所以这是我的解决方案.

I couldn't find an answer to this anywhere, so here's my solution.

问题是:如何计算 R 中的幂集?

The question is: how can you calculate a power set in R?

可以使用库sets"执行此操作,使用命令 2^as.set(c(1,2,3,4)),它会产生输出 {{}, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2,4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1,2、3、4}}.但是,这使用了递归算法,速度相当慢.

It is possible to do this with the library "sets", with the command 2^as.set(c(1,2,3,4)), which yields the output {{}, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}}. However, this uses a recursive algorithm, which is rather slow.

这是我想出的算法.

它是非递归的,因此它比其他一些解决方案快得多(在我的机器上比sets"包中的算法快约 100 倍).速度还是O(2^n).

It's non-recursive, so it's much faster than some of the other solutions out there (and ~100x faster on my machine than the algorithm in the "sets" package). The speed is still O(2^n).

该算法的概念基础如下:

The conceptual basis for this algorithm is the following:

for each element in the set:
    for each subset constructed so far:
        new subset = (subset + element)

这是 R 代码:

这是同一概念的更快版本;我的原始算法在这篇文章的第三条评论中.对于一组长度为 19 的机器,这个在我的机器上要快 30%.

here's a somewhat faster version of the same concept; my original algorithm is in the third comment to this post. This one is 30% faster on my machine for a set of length 19.

powerset = function(s){
    len = length(s)
    l = vector(mode="list",length=2^len) ; l[[1]]=numeric()
    counter = 1L
    for(x in 1L:length(s)){
        for(subset in 1L:counter){
            counter=counter+1L
            l[[counter]] = c(l[[subset]],s[x])
        }
    }
    return(l)
}

此版本通过在开始时以最终长度初始化向量并跟踪保存新子集的位置的计数器"变量来节省时间.也可以通过解析计算位置,但速度稍慢.

This version saves time by initiating the vector with its final length at the start and keeping track with the "counter" variable of the position at which to save new subsets. It's also possible to calculate the position analytically, but that was slightly slower.

推荐答案

子集可以看成是一个布尔向量,表示一个元素是否在子集中.这些布尔向量可以看作是用二进制编写的数字.枚举 1:n 的所有子集因此等价于枚举从 02^n-1 的数字.

A subset can be seen as a boolean vector, indicating whether an element is in the subset of not. Those boolean vectors can be seen as numbers written in binary. Enumerating all the subsets of 1:n is therefore equivalent to enumerating the numbers from 0 to 2^n-1.

f <- function(set) { 
  n <- length(set)
  masks <- 2^(1:n-1)
  lapply( 1:2^n-1, function(u) set[ bitwAnd(u, masks) != 0 ] )
}
f(LETTERS[1:4])

这篇关于计算R中集合的幂集(所有可能的子集)的算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆