R: Insert multiple rows (variable number) in data frame

Views: 94 | Published: 2020/10/17 0:17:44 | Tags: r, dataframe, transformation


Problem description

I have a data frame with, say, 5 rows covering 2 observables. I need to insert "dummy" or "zero" rows into the data frame so that every observable ends up with the same number of rows (the target may be larger than the row count of the longest observable). For example:

#   This is what I have:
x = c("a","a","b","b","b")
y = c(2,4,5,2,6)
dft = data.frame(x,y)
print(dft)

  x y
1 a 2
2 a 4
3 b 5
4 b 2
5 b 6

Here is what I would like to get, i.e. pad each observable to 4 rows. A mock-up data frame:

x1 = c("a","a","a","a","b","b","b","b")
y1 = c(2,4,0,0,5,2,6,0)
dft1 = data.frame(x1,y1)
print(dft1)

  x1 y1
1  a  2
2  a  4
3  a  0
4  a  0
5  b  5
6  b  2
7  b  6
8  b  0

I started by counting the rows per observable in the original data frame with ddply, so that I know how many rows I need to add for each observable.

library(plyr)
nr = ddply(dft,.(x),summarise,val=length(x))
print(nr)

  x val
1 a   2
2 b   3 

# N extras will be 2 and 1 to reach 4 per obs. 

repl      = 4 - nr$val
repl_name = nr$x
repl_x    = rep(repl_name,repl)

print(repl_x)

[1] a a b
Levels: a b

dfa = matrix("-",nrow=sum(repl),ncol=1)
dff = data.frame(repl_x,as.data.frame(dfa))

names(dff) <- names(dft)
dft = rbind(dft,dff)
dft = dft[order(as.character(dft$x)),]

print(dft)

  x y
1 a 2
2 a 4
6 a -
7 a -
3 b 5
4 b 2
5 b 6
8 b -
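One side effect of padding with "-" is that the y column no longer holds numbers, because the dash cannot be stored in a numeric column. A minimal sketch of the same padding idea with a numeric 0 instead, using only base R, is shown below. The names `target`, `counts`, `extra`, `pad` and `padded` are introduced here for illustration and are not part of the original code, and the sketch assumes that no group already has more than `target` rows.

```r
# Original data from the question
dft <- data.frame(x = c("a", "a", "b", "b", "b"),
                  y = c(2, 4, 5, 2, 6))

target <- 4                            # desired number of rows per group (assumed)
counts <- table(dft$x)                 # rows currently present in each group
extra  <- target - as.vector(counts)   # rows to add: 2 for "a", 1 for "b"

# Padding rows use a numeric 0, so y stays numeric after rbind()
pad <- data.frame(x = rep(names(counts), extra), y = 0)

padded <- rbind(dft, pad)
# order() keeps tied rows in their original order, so real rows precede padding
padded <- padded[order(as.character(padded$x)), ]
rownames(padded) <- NULL
print(padded)
```

The result has four rows per group, with y equal to 2, 4, 0, 0 for "a" and 5, 2, 6, 0 for "b", which matches the mock-up data frame shown earlier.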

I did achieve my goal, but only after quite a few operations and transformations.

So, the question: is there a simpler and faster way to insert an arbitrary number of empty/dummy rows at several places in any data frame? The number of columns and rows can be arbitrary.

Note: the code above works, so I believe this is not a "review my code" question but a genuine "how to do it better" question. Thank you!

Recommended answer

Update

Upon provocation by Thela-the-taunter™, if you want to stick with base R, perhaps you can create a function like the following:

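# Pad every group of `indf` with NA rows until it has `rowsneeded` rows.
# `group` is the name or index of the grouping column.
naRowsByGroup <- function(indf, group, rowsneeded) {
  do.call(rbind, lapply(split(indf, indf[[group]]), function(x) {
    # Extend each column to `rowsneeded` entries; the new entries are NA
    x <- data.frame(lapply(x, `length<-`, rowsneeded))
    # Refill the grouping column so the padded rows keep their group label
    x[group] <- x[[group]][1]
    x
  }))
}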
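The function relies on the replacement form of `length()`. As a small illustration (the variable names `y`, `longer` and `shorter` are made up for this sketch), assigning a larger length to a vector pads it with NA, while a smaller length truncates it:

```r
y <- c(2, 4)
length(y) <- 4                         # grow the vector; new slots become NA
print(y)                               # 2 4 NA NA

# The same operation in function-call form, as it is passed to lapply()
longer <- `length<-`(c(5, 2, 6), 4)
print(longer)                          # 5 2 6 NA

# A shorter target length drops trailing elements
shorter <- `length<-`(c(5, 2, 6), 2)
print(shorter)                         # 5 2
```

Because of the truncating behaviour, a group that already has more rows than `rowsneeded` would lose rows, so the target probably should be at least the size of the largest group.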

Usage would be:

naRowsByGroup(dft, 1, 4)
#   x  y  z
# 1 a  2  2
# 2 a  4  3
# 3 a NA NA
# 4 a NA NA
# 5 b  5  4
# 6 b  2  5
# 7 b  6  6
# 8 b NA NA

Sample data:

x = c("a","a","b","b","b")
y = c(2,4,5,2,6)
z = c(2,3,4,5,6)
dft = data.frame(x,y,z)
