R - Expand Grid Without Duplicates


Problem description

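I need a function similar to expand.grid that does not return combinations containing duplicated elements.

Here is a simplified version of my problem.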

X1 = c("x","y","z")
X2 = c("A","B","C")
X3 = c("y","C","G")

d <- expand.grid(X1,X2,X3)

d
   Var1 Var2 Var3
1     x    A    y
2     y    A    y
3     z    A    y
4     x    B    y
.     .    .    .
.     .    .    .
.     .    .    .
23    y    B    G
24    z    B    G
25    x    C    G
26    y    C    G
27    z    C    G

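d has 27 rows, but 6 of them contain duplicated values that I don't need: rows 2, 5, 8, 16, 17 and 18.

Is there a way to get only the other 21 rows, which contain no duplicates?

Note that the real vectors have more than 3 elements (c("x","y","z","k","m"...), up to 50), and there are more than 3 vectors (X4, X5, X6, ... up to 11). Because of this, the expanded object becomes very large and does not fit in RAM.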

Recommended answer

RcppAlgos* has a function called comboGrid that does this:

library(RcppAlgos) ## as of v2.4.3
comboGrid(X1, X2, X3, repetition = FALSE)
#      Var1 Var2 Var3
#  [1,] "x"  "A"  "C" 
#  [2,] "x"  "A"  "G" 
#  [3,] "x"  "A"  "y" 
#  [4,] "x"  "B"  "C" 
#  [5,] "x"  "B"  "G" 
#  [6,] "x"  "B"  "y" 
#  [7,] "x"  "C"  "G" 
#  [8,] "x"  "C"  "y" 
#  [9,] "y"  "A"  "C" 
# [10,] "y"  "A"  "G" 
# [11,] "y"  "B"  "C" 
# [12,] "y"  "B"  "G" 
# [13,] "y"  "C"  "G" 
# [14,] "z"  "A"  "C" 
# [15,] "z"  "A"  "G" 
# [16,] "z"  "A"  "y" 
# [17,] "z"  "B"  "C" 
# [18,] "z"  "B"  "G" 
# [19,] "z"  "B"  "y" 
# [20,] "z"  "C"  "G" 
# [21,] "z"  "C"  "y"
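For inputs this small, the row numbers quoted in the question can be checked with base R alone. The following is a minimal sketch, not part of the original answer: it builds the full grid and then filters it, so it only works when the grid fits in memory. The names has_dup and d_unique are introduced here purely for illustration.

```r
X1 <- c("x", "y", "z")
X2 <- c("A", "B", "C")
X3 <- c("y", "C", "G")

# Full Cartesian product; keep the values as character, not factor
d <- expand.grid(X1, X2, X3, stringsAsFactors = FALSE)

# TRUE for every row in which some value occurs more than once
has_dup <- apply(d, 1, function(r) anyDuplicated(r) > 0)

unname(which(has_dup))
# [1]  2  5  8 16 17 18

d_unique <- d[!has_dup, ]
nrow(d_unique)
# [1] 21
```

The rows come back in expand.grid order, so they are sorted differently from the comboGrid output above.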

Large test

set.seed(42)
rnd_lst <- lapply(1:11, function(x) {
    sort(sample(LETTERS, sample(26, 1)))
})

## Number of results that expand.grid would return if your machine
## had enough memory... over 300 billion!!!
prettyNum(prod(lengths(rnd_lst)), big.mark = ",")
# [1] "365,634,846,720"

exp_grd_test <- expand.grid(rnd_lst)
# Error: vector memory exhausted (limit reached?)

system.time(cmb_grd_test <- comboGrid(rnd_lst, repetition=FALSE))
#  user  system elapsed 
# 9.866   0.330  10.196 

dim(cmb_grd_test)
# [1] 3036012      11

head(cmb_grd_test)
#     Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11
# [1,] "A"  "E"  "C"  "B"  "D"  "G"  "F"  "H"  "J"  "I"   "K"  
# [2,] "A"  "E"  "C"  "B"  "D"  "G"  "F"  "H"  "J"  "I"   "L"  
# [3,] "A"  "E"  "C"  "B"  "D"  "G"  "F"  "H"  "J"  "I"   "M"  
# [4,] "A"  "E"  "C"  "B"  "D"  "G"  "F"  "H"  "J"  "I"   "N"  
# [5,] "A"  "E"  "C"  "B"  "D"  "G"  "F"  "H"  "J"  "I"   "O"  
# [6,] "A"  "E"  "C"  "B"  "D"  "G"  "F"  "H"  "J"  "I"   "P"
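If RcppAlgos is not available, the full product can also be avoided in base R by adding one vector at a time and dropping rows with a repeated value after each step. This is only a sketch under stated assumptions, not how comboGrid is implemented: expand_grid_distinct is a hypothetical helper, and it assumes non-empty character vectors with at least one row surviving each step. It also keeps rows that are reorderings of one another, whereas comboGrid is documented as treating order as irrelevant, so on heavily overlapping inputs such as the test above it may return far more rows and still run out of memory.

```r
# Hypothetical helper (not part of RcppAlgos): grow the grid column by column
# and prune rows containing a repeated value after every step.
expand_grid_distinct <- function(...) {
  vecs <- list(...)
  res <- matrix(vecs[[1]], ncol = 1)
  for (v in vecs[-1]) {
    n <- nrow(res)
    # Pair every surviving row with every value of the next vector
    left <- res[rep(seq_len(n), times = length(v)), , drop = FALSE]
    right <- rep(v, each = n)
    # Keep a pairing only if the new value is not already present in its row
    keep <- rowSums(left == right) == 0
    res <- cbind(left[keep, , drop = FALSE], right[keep])
  }
  colnames(res) <- paste0("Var", seq_along(vecs))
  res
}

X1 <- c("x", "y", "z")
X2 <- c("A", "B", "C")
X3 <- c("y", "C", "G")

g <- expand_grid_distinct(X1, X2, X3)
dim(g)
# [1] 21  3
```

Pruning early means the intermediate matrix never holds a row that is already known to be invalid, which is where the memory saving over filtering a complete expand.grid result comes from.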

* I am the author of RcppAlgos.
