Permutations of 3 elements within 6 positions


Question

I'm looking to permute (or combine) c("a","b","c") within six positions, under the condition that the sequences always alternate elements, e.g. abcbab.

Permutations can easily be obtained with:

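library(gtools)  # provides permutations()
abc <- c("a", "b", "c")
# all 3^6 sequences of length 6, with repetition allowed
permutations(n = 3, r = 6, v = abc, repeats.allowed = TRUE)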

I think it is not possible to do that with gtools, and I've been trying to design a function for it, even though I think one may already exist.

Recommended answer

Since you're looking for permutations, expand.grid can work as well as permutations. But since you don't want like-neighbors, we can shorten the dimensionality of it considerably. I think this is legitimate random-wise!

Up front:

r <- replicate(6, seq_len(length(abc)-1), simplify=FALSE)
r[[1]] <- c(r[[1]], length(abc))
m <- t(apply(do.call(expand.grid, r), 1, cumsum) %% length(abc) + 1)
m[] <- abc[m]
dim(m)
# [1] 96  6
head(as.data.frame(cbind(m, apply(m, 1, paste, collapse = ""))))
#   Var1 Var2 Var3 Var4 Var5 Var6     V7
# 1    b    c    a    b    c    a bcabca
# 2    c    a    b    c    a    b cabcab
# 3    a    b    c    a    b    c abcabc
# 4    b    a    b    c    a    b babcab
# 5    c    b    c    a    b    c cbcabc
# 6    a    c    a    b    c    a acabca
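
As a cross-check on the 96 rows above, here is a minimal brute-force sketch. It is not part of the original answer: it builds all 3^6 = 729 sequences and drops any with an immediately repeated character. The names all_seqs, united and alternating are illustrative only.

```r
abc <- c("a", "b", "c")

# Brute force: all 3^6 = 729 sequences with repetition, one per row.
all_seqs <- expand.grid(replicate(6, abc, simplify = FALSE),
                        stringsAsFactors = FALSE)
united <- apply(all_seqs, 1, paste, collapse = "")

# Keep only strings in which no character is immediately repeated.
alternating <- united[!grepl("(.)\\1", united, perl = TRUE)]

length(alternating)
# [1] 96
```

The count agrees with 3 * 2^5 = 96: three choices for the first position, then two for each of the remaining five.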






Walk-through:


  • since you want all permutations of it with repetition, we can use gtools::permutations, or we can use expand.grid ... I'll use the latter; I don't know if it's much faster, but it allows a short-cut I need (more later)
  • when dealing with constraints like this, I like to expand on the indices of the vector of values
  • however, since we don't want neighbors to be the same, I thought that instead of each row of values being the straight index, we cumsum them; by using this, we can control whether the cumulative sum can re-reach the same value ... by removing 0 and length(abc) from the list of possible values (for every position after the first), we remove the possibility of (a) staying the same, and (b) increasing by exactly one vector-length (which would repeat the same value); as a walk-through:

head(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), n = 6)
#   Var1 Var2 Var3 Var4 Var5 Var6
# 1    1    1    1    1    1    1
# 2    2    1    1    1    1    1
# 3    3    1    1    1    1    1
# 4    1    2    1    1    1    1
# 5    2    2    1    1    1    1
# 6    3    2    1    1    1    1

Since the first value can be any of the three, it's 1:3, but each additional value is intended to be 1 or 2 away from the previous one.

head(t(apply(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), 1, cumsum)), n = 6)
#      Var1 Var2 Var3 Var4 Var5 Var6
# [1,]    1    2    3    4    5    6
# [2,]    2    3    4    5    6    7
# [3,]    3    4    5    6    7    8
# [4,]    1    3    4    5    6    7
# [5,]    2    4    5    6    7    8
# [6,]    3    5    6    7    8    9

Okay, that doesn't seem that useful (since it goes beyond the length of the vector), so we can invoke the modulus operator and a shift (modulus returns 0-based values, and we want 1-based):

head(t(apply(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), 1, cumsum) %% 3 + 1), n = 6)
#      Var1 Var2 Var3 Var4 Var5 Var6
# [1,]    2    3    1    2    3    1
# [2,]    3    1    2    3    1    2
# [3,]    1    2    3    1    2    3
# [4,]    2    1    2    3    1    2
# [5,]    3    2    3    1    2    3
# [6,]    1    3    1    2    3    1
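
To see why this never yields equal neighbors, it may help to trace a single row by hand. The sketch below is illustrative and not from the original answer; the steps vector is an arbitrary example row. After the modulus, the difference between neighboring indices is exactly the step that produced it (mod 3), and a step of 1 or 2 is never 0 mod 3.

```r
abc <- c("a", "b", "c")

# One example row: first entry drawn from 1:3, the rest from 1:2.
steps <- c(3, 1, 2, 1, 1, 2)

# Cumulative sum, wrapped into 1..3.
idx <- cumsum(steps) %% length(abc) + 1
idx
# [1] 1 2 1 2 3 2

paste(abc[idx], collapse = "")
# [1] "ababcb"

# Neighbor differences (mod 3) recover the steps after the first one.
diff(idx) %% length(abc)
# [1] 1 2 1 1 2
```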


  • to verify this works, we can do a diff across each row and look for 0:

    m <- t(apply(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), 1, cumsum) %% 3 + 1)
    any(apply(m, 1, diff) == 0)
    # [1] FALSE
    


  • to automate this to an arbitrary vector, we enlist the help of replicate to generate the list of possible vectors:

    r <- replicate(6, seq_len(length(abc)-1), simplify=FALSE)
    r[[1]] <- c(r[[1]], length(abc))
    str(r)
    # List of 6
    #  $ : int [1:3] 1 2 3
    #  $ : int [1:2] 1 2
    #  $ : int [1:2] 1 2
    #  $ : int [1:2] 1 2
    #  $ : int [1:2] 1 2
    #  $ : int [1:2] 1 2
    

    and then do.call to expand it.

    You have the matrix of indices:

    head(m)
    #      Var1 Var2 Var3 Var4 Var5 Var6
    # [1,]    2    3    1    2    3    1
    # [2,]    3    1    2    3    1    2
    # [3,]    1    2    3    1    2    3
    # [4,]    2    1    2    3    1    2
    # [5,]    3    2    3    1    2    3
    # [6,]    1    3    1    2    3    1
    

    and then replace each index with the vector's value:

    m[] <- abc[m]
    head(m)
    #      Var1 Var2 Var3 Var4 Var5 Var6
    # [1,] "b"  "c"  "a"  "b"  "c"  "a" 
    # [2,] "c"  "a"  "b"  "c"  "a"  "b" 
    # [3,] "a"  "b"  "c"  "a"  "b"  "c" 
    # [4,] "b"  "a"  "b"  "c"  "a"  "b" 
    # [5,] "c"  "b"  "c"  "a"  "b"  "c" 
    # [6,] "a"  "c"  "a"  "b"  "c"  "a" 
    


  • and then we cbind the united string (via apply and paste)
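
The steps of the walk-through can be collected into one function. This is a sketch under assumptions rather than the answer's own code: the name alternating_sequences and its arguments v and len are hypothetical, and it returns only the pasted strings.

```r
# Hypothetical helper (not part of the original answer): all sequences of
# length `len` over the values in `v` in which no two neighbors are equal.
alternating_sequences <- function(v, len) {
  n <- length(v)
  stopifnot(n >= 2, len >= 2)
  # Allowed steps between neighbors are 1..(n-1); the first position may
  # also use n, so that every starting value is reachable.
  r <- replicate(len, seq_len(n - 1), simplify = FALSE)
  r[[1]] <- c(r[[1]], n)
  steps <- as.matrix(do.call(expand.grid, r))
  # Row-wise cumulative sums, wrapped into 1..n.
  idx <- t(apply(steps, 1, cumsum)) %% n + 1
  # Replace each index with its value, then paste each row into one string.
  m <- matrix(v[idx], nrow = nrow(idx))
  apply(m, 1, paste, collapse = "")
}

res <- alternating_sequences(c("a", "b", "c"), 6)
length(res)
# [1] 96
anyDuplicated(res)
# [1] 0
any(grepl("(.)\\1", res, perl = TRUE))
# [1] FALSE
```

The same function handles other inputs; four values in three positions, for example, should give 4 * 3 * 3 = 36 sequences.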

    Performance:

    library(microbenchmark)
    library(dplyr)
    library(tidyr)
    library(stringr)
    
    microbenchmark(
      tidy1 = {
        gtools::permutations(n = 3, r = 6, v = abc, repeats.allowed = TRUE) %>% 
          data.frame() %>% 
          unite(united, sep = "", remove = FALSE) %>%
          filter(!str_detect(united, "([a-c])\\1"))
      },
      tidy2 = {
          filter(unite(data.frame(gtools::permutations(n = 3, r = 6, v = abc, repeats.allowed = TRUE)),
                       united, sep = "", remove = FALSE),
                 !str_detect(united, "([a-c])\\1"))
      },
      base = {
        r <- replicate(6, seq_len(length(abc)-1), simplify=FALSE)
        r[[1]] <- c(r[[1]], length(abc))
        m <- t(apply(do.call(expand.grid, r), 1, cumsum) %% length(abc) + 1)
        m[] <- abc[m]
      },
      times=10000
    )
    # Unit: microseconds
    #   expr      min        lq     mean   median       uq       max neval
    #  tidy1 1875.400 2028.8510 2446.751 2165.651 2456.051 12790.901 10000
    #  tidy2 1745.402 1875.5015 2284.700 2000.051 2278.101 50163.901 10000
    #   base  796.701  871.4015 1020.993  919.801 1021.801  7373.901 10000
    

    I tried the nested (non-%>%) tidy2 version just for kicks, and though I was confident it would theoretically be faster, I didn't realize it would shave over 7% off the run-times. (The 50163 is likely R garbage-collecting, not "real".) That is the price we pay for readability/maintainability.
