Permute all unique enumerations of a vector in R

Question

I'm trying to find a function that will permute all the unique permutations of a vector, while not counting juxtapositions within subsets of the same element type. For example:

dat <- c(1,0,3,4,1,0,0,3,0,4)

factorial(10)
> 3628800

possible permutations, but only 10!/(2!*2!*4!*2!)

factorial(10)/(factorial(2)*factorial(2)*factorial(2)*factorial(4))
> 18900

unique permutations when ignoring juxtapositions within subsets of the same element type.
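
The count above generalizes to any vector: divide n! by the product of the factorials of each distinct value's frequency. A minimal sketch using base R only; the names `counts` and `n_unique` are illustrative and not part of the original post:

```r
dat <- c(1, 0, 3, 4, 1, 0, 0, 3, 0, 4)

# Frequency of each distinct value: 0 occurs 4 times; 1, 3 and 4 occur twice each.
counts <- as.vector(table(dat))

# Multinomial coefficient: n! / (k1! * k2! * ... * km!)
n_unique <- factorial(length(dat)) / prod(factorial(counts))
n_unique  # 18900
```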

I can get this by using unique() and the permn() function from the package combinat

library(combinat)
unique( permn(dat) )

but this is computationally very expensive, since it involves enumerating n!, which can be an order of magnitude more permutations than I need. Is there a way to do this without first computing n!?

Recommended answer

Here's a faster answer; again based on the ideas of Louisa Grey and Bryce Wagner, but with faster R code thanks to better use of matrix indexing. It's quite a bit faster than my original:

> d <- c(1,0,3,4,1,0,0,3,0,4)
> system.time(up1 <- uniqueperm(d))
   user  system elapsed 
  0.183   0.000   0.186 
> system.time(up2 <- uniqueperm2(d))
   user  system elapsed 
  0.037   0.000   0.038 

And the code:

uniqueperm2 <- function(d) {
  dat <- factor(d)
  N <- length(dat)
  n <- tabulate(dat)
  ng <- length(n)
  if(ng==1) return(d)
  a <- N-c(0,cumsum(n))[-(ng+1)]
  foo <- lapply(1:ng, function(i) matrix(combn(a[i],n[i]),nrow=n[i]))
  out <- matrix(NA, nrow=N, ncol=prod(sapply(foo, ncol)))
  xxx <- c(0,cumsum(sapply(foo, nrow)))
  xxx <- cbind(xxx[-length(xxx)]+1, xxx[-1])
  miss <- matrix(1:N,ncol=1)
  for(i in seq_len(length(foo)-1)) {
    l1 <- foo[[i]]
    nn <- ncol(miss)
    miss <- matrix(rep(miss, ncol(l1)), nrow=nrow(miss))
    k <- (rep(0:(ncol(miss)-1), each=nrow(l1)))*nrow(miss) + 
               l1[,rep(1:ncol(l1), each=nn)]
    out[xxx[i,1]:xxx[i,2],] <- matrix(miss[k], ncol=ncol(miss))
    miss <- matrix(miss[-k], ncol=ncol(miss))
  }
  k <- length(foo)
  out[xxx[k,1]:xxx[k,2],] <- miss
  out <- out[rank(as.numeric(dat), ties.method="first"),]
  foo <- cbind(as.vector(out), as.vector(col(out)))
  out[foo] <- d
  t(out)
}

It doesn't return the same order, but after sorting, the results are identical.

up1a <- up1[do.call(order, as.data.frame(up1)),]
up2a <- up2[do.call(order, as.data.frame(up2)),]
identical(up1a, up2a)

For my first attempt, see the edit history.
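
For readers who want to see the underlying idea without the matrix indexing, here is a slower recursive sketch. It is not the answer's `uniqueperm2()`; it only illustrates the general approach of placing one distinct value at a time into the positions that are still free, using `combn()` to choose those positions. The name `unique_perms_sketch` and its internals are hypothetical.

```r
# Illustrative sketch only: simpler and slower than uniqueperm2().
unique_perms_sketch <- function(d) {
  vals   <- sort(unique(d))
  counts <- as.vector(table(factor(d, levels = vals)))
  N      <- length(d)

  # Place value i into some of the free positions, then recurse on the rest.
  place <- function(i, free, current) {
    if (i > length(vals)) return(list(current))
    # Each column of `choices` picks counts[i] indices into `free`.
    choices <- combn(length(free), counts[i])
    out <- list()
    for (j in seq_len(ncol(choices))) {
      idx <- choices[, j]
      nxt <- current
      nxt[free[idx]] <- vals[i]
      out <- c(out, place(i + 1, free[-idx], nxt))
    }
    out
  }

  # One permutation per row.
  do.call(rbind, place(1, seq_len(N), rep(NA, N)))
}

p <- unique_perms_sketch(c(1, 0, 0, 1))
nrow(p)  # 6, i.e. 4!/(2!*2!)
```

Because each value's positions are chosen as a combination rather than an ordering, no duplicate rows are generated and n! permutations are never enumerated.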
