Duplication of Rows in data.frame in R

Problem description

I have a large data.frame that looks similar to the example below:

  ID date sex grade location
1  1 2000   m     1        x
2  1 2001   m     2        y
3  2 1999   f     3        z
4  2 2000   f     4        f
5  3 2000   m     5        k
6  3 2001   m     6        l

To reproduce it, run:

df <- data.frame(ID=c(1,1,2,2,3,3),
                     date=c(2000,2001,1999,2000,2000,2001),
                     sex = c("m", "m", "f", "f", "m", "m"),
                     grade =c(1,2,3,4,5,6),
                     location =c("x","y","z", "f","k","l") )

I would like to reshape my data.frame so that every ID has a row for every date, giving the following structure:

      ID date sex grade location
    1  1 1999   m     0        0
    2  1 2000   m     1        x
    3  1 2001   m     2        y
    4  2 1999   f     3        z
    5  2 2000   f     4        f
    6  2 2001   f     0        0
    7  3 1999   m     0        0
    8  3 2000   m     5        k
    9  3 2001   m     6        l

Recommended answer

This can be done with data.table like so:

library(data.table)
setDT(df, key = c("ID", "date"))

> df[CJ(ID, date, unique = TRUE)]
   ID date sex grade location
1:  1 1999  NA    NA       NA
2:  1 2000   m     1        x
3:  1 2001   m     2        y
4:  2 1999   f     3        z
5:  2 2000   f     4        f
6:  2 2001  NA    NA       NA
7:  3 1999  NA    NA       NA
8:  3 2000   m     5        k
9:  3 2001   m     6        l
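
To see why the join produces the NA rows, it can help to look at the grid that `CJ` builds by itself. The following is a minimal sketch on the example data, assuming `location` and `sex` are character columns (the default in R 4.0 and later); the names `grid` and `full` are introduced here only for illustration.

```r
library(data.table)

# Rebuild the example data as a data.table keyed on ID and date
df <- data.table(ID = c(1, 1, 2, 2, 3, 3),
                 date = c(2000, 2001, 1999, 2000, 2000, 2001),
                 sex = c("m", "m", "f", "f", "m", "m"),
                 grade = c(1, 2, 3, 4, 5, 6),
                 location = c("x", "y", "z", "f", "k", "l"),
                 key = c("ID", "date"))

# CJ() is a cross join: every unique ID paired with every unique date
grid <- df[, CJ(ID, date, unique = TRUE)]
nrow(grid)   # 3 IDs x 3 dates = 9 combinations

# Joining df on that grid keeps all 9 combinations;
# combinations that are absent from df get NA in the non-key columns
full <- df[grid]
full[is.na(grade), .(ID, date)]   # the three combinations that were missing
```

Because the join matches on the key columns, the rows that already exist in `df` come through unchanged and only the missing ID/date pairs are padded with NA.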

If you want sex to be filled in consistently within each ID:

df <- df[CJ(ID, date, unique = TRUE)]

df[ , sex := unique(na.omit(sex)), by = ID]

If you really want 0 instead of NA for grade and location (you should reconsider this, as it is likely preferable to leave them as NA):

df[is.na(grade), grade := 0]
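# Only needed when location is a factor: append "0" as a new level.
# (Prepending it would relabel the existing values.)
if (is.factor(df$location)) levels(df$location) <- c(levels(df$location), "0")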
df[is.na(location), location := "0"]
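
For readers who cannot use data.table, roughly the same result can be sketched in base R with `expand.grid` and `merge`. This is an illustrative alternative, not part of the original answer. It assumes `stringsAsFactors = FALSE` so that `sex` and `location` are character columns, and the names `grid`, `known` and `out` are introduced here for illustration.

```r
df <- data.frame(ID = c(1, 1, 2, 2, 3, 3),
                 date = c(2000, 2001, 1999, 2000, 2000, 2001),
                 sex = c("m", "m", "f", "f", "m", "m"),
                 grade = c(1, 2, 3, 4, 5, 6),
                 location = c("x", "y", "z", "f", "k", "l"),
                 stringsAsFactors = FALSE)

# All ID/date combinations
grid <- expand.grid(ID = unique(df$ID), date = unique(df$date))

# Left join: keep every combination, NA where df has no matching row
out <- merge(grid, df, by = c("ID", "date"), all.x = TRUE)
out <- out[order(out$ID, out$date), ]
rownames(out) <- NULL

# Fill sex within each ID from the rows where it is known
known <- unique(df[, c("ID", "sex")])
out$sex <- known$sex[match(out$ID, known$ID)]

# Optional: replace the remaining NAs with 0 / "0"
out$grade[is.na(out$grade)] <- 0
out$location[is.na(out$location)] <- "0"
```

The `match` lookup assumes each ID has exactly one sex value, as in the example data; if an ID carried conflicting values, the first one found would be used.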
