来自不同集合的混合组合/排列 [英] mixed combinations / permutations from different sets

查看:65
本文介绍了来自不同集合的混合组合/排列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

此问与答; A受如何在R中的某些条件下构建置换的启发.

This Q & A is motivated by How to build permutation with some conditions in R.

到目前为止,已经有一些不错的R包,例如RcppAlgosarrangements,它们在单个集合上提供有效的组合/排列 .例如,如果我们要从letters[1:6]中选择3个项目,则以下给出了所有组合:

So far, there has been a few good R packages like RcppAlgos and arrangements offering efficient combinations / permutations on a single set. For example, the following gives all combinations if we want to choose 3 items from letters[1:6]:

library(RcppAlgos)
comboGeneral(letters[1:6], 3)
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c" 
# [2,] "a"  "b"  "d" 
# [3,] "a"  "b"  "e" 
# [4,] "a"  "b"  "f" 
# [5,] "a"  "c"  "d" 
# [6,] "a"  "c"  "e" 
# [7,] "a"  "c"  "f" 
# [8,] "a"  "d"  "e" 
# [9,] "a"  "d"  "f" 
#[10,] "a"  "e"  "f" 
#[11,] "b"  "c"  "d" 
#[12,] "b"  "c"  "e" 
#[13,] "b"  "c"  "f" 
#[14,] "b"  "d"  "e" 
#[15,] "b"  "d"  "f" 
#[16,] "b"  "e"  "f" 
#[17,] "c"  "d"  "e" 
#[18,] "c"  "d"  "f" 
#[19,] "c"  "e"  "f" 
#[20,] "d"  "e"  "f" 

但是,如果我们想要更复杂的东西,例如

However, what if we want something more complicated, like

  • LETTERS[1:2]
  • 中选择1个项目
  • letters[1:6]
  • 中选择3个项目
  • as.character(1:3)
  • 中选择2个项目
  • choose 1 item from LETTERS[1:2]
  • choose 3 items from letters[1:6]
  • choose 2 items from as.character(1:3)

如何生成所有组合以及可选的所有排列?

How to generate all combinations and optionally all permutations?

推荐答案

假设我们有一组集合set_list,其中从set_list[[i]]中选择了k[i]个项目,然后从数学上讲,我们将解决该问题例如:

Suppose we have a list of sets set_list, where k[i] items are chosen from set_list[[i]], then mathematically speaking, we would address the problem as such:

  1. 为每组生成所有组合;
  2. 合并所有集合的组合;
  3. 为每个组合创建所有排列.

下面的函数MixedCombnPerm是我的实现,对于步骤1和步骤3使用RcppAlgos.当前,步骤2未使用最佳算法.这是一种残酷的力量",它依赖于expand.grid和随后的rbind的更快实现.我知道一种更快的递归方法(例如用于在mgcv中形成张量积模型矩阵的递归方法)可以在Rcpp中进行编码,但是由于时间原因,我现在不这样做.

The function MixedCombnPerm below is my implementation, using RcppAlgos for step 1 and step 3. Currently step 2 is not using the optimal algorithm. It is a "brutal force" relying a faster implementation of expand.grid and a subsequent rbind. I know a faster recursive method (like the one used for forming a tensor product model matrix in mgcv) which can be coded up in Rcpp, but for time reason I would not do it now.

library(RcppAlgos)

MixedCombnPerm <- function (set_list, k, perm = FALSE) {

  ###################
  ## mode checking ##
  ###################

  if (!all(vapply(set_list, is.vector, TRUE)))
    stop("All sets must be 'vectors'!")

  if (length(unique(vapply(set_list, mode, ""))) > 1L)
    stop("Please ensure that all sets have the same mode!")

  ################
  ## basic math ##
  ################

  ## size of each sets
  n <- lengths(set_list, FALSE)
  ## input validation
  if (length(n) != length(k)) stop("length of 'k' different from number of sets!")
  if (any(k > n)) stop("can't choose more items than set size!")
  ## number of sets
  n_sets <- length(n)
  ## total number of items
  n_items <- sum(k)
  ## number of combinations
  n_combinations_by_set <- choose(n, k)
  n_combinations <- prod(n_combinations_by_set)

  #################################
  ## step 1: combinations by set ##
  #################################

  ## generate `n_combinations[i]` combinations on set i
  combinations_by_set <- vector("list", n_sets)
  for (i in seq_len(n_sets)) {
    ## each column of combinations_by_set[[i]] is a record
    combinations_by_set[[i]] <- t.default(comboGeneral(set_list[[i]], k[i]))
    }

  ################################
  ## step 2: merge combinations ##
  ################################

  ## merge combinations from all sets
  ## slow_expand_grid <- function (m) expand.grid(lapply(m, seq_len))
  fast_expand_grid <- function (m) {
    n_sets <- length(m)      ## number of sets
    mm <- c(1L, cumprod(m))  ## cumulative leading dimension
    grid_size <- mm[n_sets + 1L]  ## size of the grid
    grid_ind <- vector("list", n_sets)
    for (i in seq_len(n_sets)) {
      ## grid_ind[[i]] <- rep_len(rep(seq_len(m[i]), each = mm[i]), M)
      grid_ind[[i]] <- rep_len(rep.int(seq_len(m[i]), rep.int(mm[i], m[i])), grid_size)
      }
    grid_ind
    }
  grid_ind <- fast_expand_grid(n_combinations_by_set)

  ## each column is a record
  combinations_grid <- mapply(function (x, j) x[, j, drop = FALSE],
                       combinations_by_set, grid_ind,
                       SIMPLIFY = FALSE, USE.NAMES = FALSE)
  all_combinations <- do.call("rbind", combinations_grid)

  ########################################################
  ## step 3: generate permutations for each combination ##
  ########################################################

  if (!perm) return(all_combinations)
  else {
    ## generate `factorial(n_items)` permutations for each combination
    all_permutations <- vector("list", n_combinations)
    for (i in seq_len(n_combinations)) {
      all_permutations[[i]] <- permuteGeneral(all_combinations[, i], n_items)
      }
    return(all_permutations)
    }

  }

该函数进行严格的输入检查.用户应确保所有集合均以向量"形式给出,并且它们具有相同的模式.因此,对于问题中的示例,我们应提供:

The function does a strict input checking. User should ensure that all sets are given as "vector" and they have the same mode. So for the example in the question, we should provide:

## note the "as.character(1:3)"
set_list <- list(LETTERS[1:2], letters[1:6], as.character(1:3))
k <- c(1, 3, 2)

如果参数perm = FALSE(默认值),则该函数以矩阵(每列为一条记录)的形式返回组合.否则,它将返回一个矩阵列表,每个矩阵都给出特定组合的排列(每行是一条记录).

The function returns combinations in a matrix (each column is a record) if argument perm = FALSE (default). Otherwise it returns a list of matrices, each giving permutations (each row is a record) for a particular combination.

尝试示例:

combinations <- MixedCombnPerm(set_list, k)
permutations <- MixedCombnPerm(set_list, k, TRUE)

检查结果:

combinations[, 1:6]
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] "A"  "B"  "A"  "B"  "A"  "B" 
#[2,] "a"  "a"  "a"  "a"  "a"  "a" 
#[3,] "b"  "b"  "b"  "b"  "b"  "b" 
#[4,] "c"  "c"  "d"  "d"  "e"  "e" 
#[5,] "1"  "1"  "1"  "1"  "1"  "1" 
#[6,] "2"  "2"  "2"  "2"  "2"  "2" 

permutations[[1]][1:6, ]
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] "A"  "a"  "b"  "c"  "1"  "2" 
#[2,] "A"  "a"  "b"  "c"  "2"  "1" 
#[3,] "A"  "a"  "b"  "1"  "c"  "2" 
#[4,] "A"  "a"  "b"  "1"  "2"  "c" 
#[5,] "A"  "a"  "b"  "2"  "c"  "1" 
#[6,] "A"  "a"  "b"  "2"  "1"  "c" 

这篇关于来自不同集合的混合组合/排列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆