Replicate each row of data.frame and specify the number of replications for each row?

Views: 251 | Published: 2020/10/16 21:06:08 | Tags: r, dataframe, replication

Question

I am programming in R and I have run into the following problem:

I have a data frame jb that is quite long. Here's a simplified version of it:

jb:    a     b     frequency               jb.expanded: a    b   
       5     3        2                                 5    3
       5     7        1                                 5    3
       9     1        40                                5    7
       12    4        5                                 9    1
       12    5        13                                9    1
                                                        ...  ...

I want to replicate the rows, with the number of replications given by the frequency column. That is, the first row is replicated twice, the second row once, and so on. I already solved that problem with this code:

jb.expanded <- jb[rep(row.names(jb), jb$frequency), 1:2]

Now here is the problem:

Whenever a number in the frequency column is greater than 10, the number of replicated rows is wrong. For example:

Frequency: 43 --> 14 rows
           40 --> 13 rows
           13 --> 11 rows
           14 --> 12 rows

Can you help me? I have no idea how to fix this, and I could not find anything about it on the internet.

Thanks for your help!

Answer

Update

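Upon revisiting this question, I have a feeling that @Codoremifa was correct in their assumption that your frequency column might be a factor.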


Here's an example if that were the case. It won't match your actual data, since I don't know what other levels are in your dataset.
mydf$F2 <- factor(as.character(mydf$frequency))
## expandRows(mydf, "F2")
mydf[rep(rownames(mydf), mydf$F2), ]
#      a b frequency F2
# 1    5 3         2  2
# 1.1  5 3         2  2
# 1.2  5 3         2  2
# 2    5 7         1  1
# 3    9 1        40 40
# 3.1  9 1        40 40
# 3.2  9 1        40 40
# 3.3  9 1        40 40
# 4   12 4         5  5
# 4.1 12 4         5  5
# 4.2 12 4         5  5
# 4.3 12 4         5  5
# 4.4 12 4         5  5
# 5   12 5        13 13
# 5.1 12 5        13 13

Hmmm. That doesn't look like 61 rows to me. Why not? Because rep uses the numeric values underlying the factor, which in this case are quite different from the displayed values:
as.numeric(mydf$F2)
# [1] 3 1 4 5 2

To convert it properly, you need:
as.numeric(as.character(mydf$F2))
# [1]  2  1 40  5 13
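
The answer never shows how mydf was built, so here is a self-contained sketch of the same pitfall. The data frame below is an assumed reconstruction of the question's sample data, and the names codes, counts, wrong and right are illustrative only:

```r
# Assumed reconstruction of the sample data from the question
mydf <- data.frame(a = c(5, 5, 9, 12, 12),
                   b = c(3, 7, 1, 4, 5),
                   frequency = c(2, 1, 40, 5, 13))

# Simulate a frequency column that was read in as a factor
mydf$F2 <- factor(as.character(mydf$frequency))

# Internal level codes (what rep effectively sees for a factor)
codes <- as.integer(mydf$F2)

# Displayed labels converted back to numbers (what was intended)
counts <- as.numeric(as.character(mydf$F2))

wrong <- mydf[rep(rownames(mydf), codes), c("a", "b")]
right <- mydf[rep(rownames(mydf), counts), c("a", "b")]

nrow(wrong)  # total of the level codes
nrow(right)  # total of the real frequencies
```

Comparing sum(codes) with sum(counts) is a quick way to check whether a count column has silently become a factor.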

 
 
 
 
 
 原始答案
 
 
 前一阵子，我写了一个函数@ Simono101的答案的概括。函数看起来像这样：




Original answer

A while ago I wrote a function that is a bit more of a generalization of @Simono101's answer. The function looks like this:
expandRows <- function(dataset, count, count.is.col = TRUE) {
  if (!isTRUE(count.is.col)) {
    if (length(count) == 1) {
      dataset[rep(rownames(dataset), each = count), ]
    } else {
      if (length(count) != nrow(dataset)) {
        stop("Expand vector does not match number of rows in data.frame")
      }
      dataset[rep(rownames(dataset), count), ]
    }
  } else {
    dataset[rep(rownames(dataset), dataset[[count]]), 
            setdiff(names(dataset), names(dataset[count]))]
  }
}

 
 
 
 
 
 出于您的目的，您可以只使用 expandRows（mydf， frequency） 
head(expandRows(mydf, "frequency"))
#     a b
# 1   5 3
# 1.1 5 3
# 2   5 7
# 3   9 1
# 3.1 9 1
# 3.2 9 1   
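
If the count column might be a factor, as the update suggests, a defensive variant could convert it before expanding. This is only a sketch: as_counts is a hypothetical helper that is not part of the original answer, and the sample data is assumed from the question.

```r
# Hypothetical helper (not from the original answer): convert a count
# column to integers via its printed labels, so factors are handled safely.
as_counts <- function(x) {
  as.integer(as.character(x))
}

# Assumed sample data, with frequency deliberately stored as a factor
jb <- data.frame(a = c(5, 5, 9, 12, 12),
                 b = c(3, 7, 1, 4, 5),
                 frequency = factor(c(2, 1, 40, 5, 13)))

# Integer row positions avoid any dependence on row names
idx <- rep(seq_len(nrow(jb)), times = as_counts(jb$frequency))
jb.expanded <- jb[idx, c("a", "b")]

nrow(jb.expanded)
```

Going through as.character first is harmless for plain numeric or integer columns, so the same helper works whether or not the column is a factor.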

Other options are to repeat each row the same number of times:
expandRows(mydf, 2, count.is.col=FALSE)
#      a b frequency
# 1    5 3         2
# 1.1  5 3         2
# 2    5 7         1
# 2.1  5 7         1
# 3    9 1        40
# 3.1  9 1        40
# 4   12 4         5
# 4.1 12 4         5
# 5   12 5        13
# 5.1 12 5        13

Or to specify a vector of how many times to repeat each row:
expandRows(mydf, c(1, 2, 1, 0, 2), count.is.col=FALSE)
#      a b frequency
# 1    5 3         2
# 2    5 7         1
# 2.1  5 7         1
# 3    9 1        40
# 5   12 5        13
# 5.1 12 5        13

Note the required count.is.col = FALSE argument in those last two options.
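
Both of those modes rest on how rep treats its each and times arguments. A minimal illustration on plain row positions (the variable names are illustrative only):

```r
rows <- 1:5

# A single count with `each` repeats every element that many times
each_twice <- rep(rows, each = 2)

# A vector of counts with `times` repeats element i counts[i] times;
# a count of 0 drops that element entirely
per_row <- rep(rows, times = c(1, 2, 1, 0, 2))

each_twice
per_row
```

This is why row 4 is absent from the last output above: its count is 0.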
