Combine rows with same id and delete duplicated rows


Question

After merging some data, I have multiple rows per id. I want to keep multiple rows for the same id only if their data differ. An NA value should be treated as equal to any other value in the same column.

df <- structure(list(id = c(1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 5L), 
    v1 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, 1L), .Label = "a", class = "factor"), 
    v2 = structure(c(1L, 2L, 2L, 3L, 1L, 1L, 1L, 1L, NA, 1L), .Label = c("a", 
    "b", "c"), class = "factor"), v3 = structure(c(1L, 1L, 1L, 
    1L, 1L, 1L, NA, 2L, 2L, 1L), .Label = c("a", "b"), class = "factor")), .Names = c("id", 
"v1", "v2", "v3"), row.names = c(NA, -10L), class = "data.frame")



The data looks like:

   id   v1   v2   v3
    1    a    a    a
    2    a    b    a
    2 <NA>    b    a
    2    a    c    a
    3    a    a    a
    3    a    a    a
    4    a    a <NA>
    4 <NA>    a    b
    4    a <NA>    b
    5    a    a    a



Desired output:

   id   v1   v2   v3
    1    a    a    a
    2    a    b    a
    2    a    c    a
    3    a    a    a
    4    a    a    b
    5    a    a    a

A data.table solution would be welcome, if one exists.

Recommended answer

A possible solution using the data.table package:

library(data.table)
# Within each id, drop NAs from every column and keep the distinct values that remain
setDT(df)[, lapply(.SD, function(x) unique(na.omit(x))), by = id]

which gives:


   id v1 v2 v3
1:  1  a  a  a
2:  2  a  b  a
3:  2  a  c  a
4:  3  a  a  a
5:  4  a  a  b
6:  5  a  a  a
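To see why this yields the desired rows, it may help to apply the same per-column step by hand to two of the groups. The following is a minimal base R sketch, not part of the original answer; the helper name `collapse_column` and the lists `g2` and `g4` are made up for illustration and hold the values of id 2 and id 4 as plain character vectors.

```r
# Hypothetical helper: the same function the answer passes to lapply(.SD, ...)
collapse_column <- function(x) unique(na.omit(x))

# Column values of group id == 4 from the question
g4 <- list(v1 = c("a", NA, "a"), v2 = c("a", "a", NA), v3 = c(NA, "b", "b"))
# Column values of group id == 2 from the question
g2 <- list(v1 = c("a", NA, "a"), v2 = c("b", "b", "c"), v3 = c("a", "a", "a"))

res4 <- lapply(g4, collapse_column)
res2 <- lapply(g2, collapse_column)

# Number of distinct non-NA values left in each column of each group
print(lengths(res4))
print(lengths(res2))
```

For id 4 every column collapses to a single value, so the group becomes one row. For id 2 only v2 keeps two values, and data.table recycles the length-one columns to match, which gives two rows. Note that each column is collapsed independently of the others, so the approach suits data like this example, where at most one column varies within an id; if several columns vary within the same id, the original pairing of values across columns is not preserved.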

