Consolidating multiple duplicated rows of a dataframe in R
Question
I have a large dataset with six columns. The first column holds Identifiers, and the remaining five columns hold the ratio values for each Identifier:
Identifier cd_log.ratios cs_log.ratios me_log.ratios pn_log.ratios sm_log.ratios
A2ICC5 0.3784142 NA NA NA NA
A2ICC5 NA -0.4910396 NA NA NA
A2ICC5 NA NA -0.1755617 NA NA
A2ICC5 NA NA NA NA 0.2279259
A2ICC8 0.3045490 NA NA NA NA
A2ICC8 NA 0.2045638 NA NA NA
Notice that the first four rows share the Identifier A2ICC5, and each of them holds a value in a different one of four ratio columns. How can I consolidate my dataframe so that each Identifier appears only once, with its ratios collected into a single row? The output would look something like this:
Identifier cd_log.ratios cs_log.ratios me_log.ratios pn_log.ratios sm_log.ratios
A2ICC5 0.3784142 -0.4910396 -0.1755617 NA 0.2279259
A2ICC8 0.304549 0.2045638 NA NA NA
Thanks in advance!
Recommended answer
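# Rebuild the example data from the question
df <- read.table(text = 'Identifier cd_log.ratios cs_log.ratios me_log.ratios pn_log.ratios sm_log.ratios
A2ICC5 0.3784142 NA NA NA NA
A2ICC5 NA -0.4910396 NA NA NA
A2ICC5 NA NA -0.1755617 NA NA
A2ICC5 NA NA NA NA 0.2279259
A2ICC8 0.3045490 NA NA NA NA
A2ICC8 NA 0.2045638 NA NA NA', header = TRUE)

library(data.table)
dt <- as.data.table(df)

# Within each Identifier, drop the NAs from every ratio column (.SD holds the
# non-grouping columns). A column with nothing left in a group shows as NA below.
dt[, lapply(.SD, na.omit), by = Identifier]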
# Identifier cd_log.ratios cs_log.ratios me_log.ratios pn_log.ratios sm_log.ratios
#1: A2ICC5 0.3784142 -0.4910396 -0.1755617 NA 0.2279259
#2: A2ICC8 0.3045490 0.2045638 NA NA NA
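The data.table call above gives one row per Identifier when each column has at most one non-missing value per Identifier, as in the sample data. If you would rather not depend on data.table, here is a minimal base R sketch of the same idea. It assumes that taking the first non-missing value per column is what you want; `first_non_na` is a hypothetical helper name, not part of the original answer.

```r
# Rebuild the example data from the question.
df <- read.table(text = "Identifier cd_log.ratios cs_log.ratios me_log.ratios pn_log.ratios sm_log.ratios
A2ICC5 0.3784142 NA NA NA NA
A2ICC5 NA -0.4910396 NA NA NA
A2ICC5 NA NA -0.1755617 NA NA
A2ICC5 NA NA NA NA 0.2279259
A2ICC8 0.3045490 NA NA NA NA
A2ICC8 NA 0.2045638 NA NA NA", header = TRUE)

# Hypothetical helper: return the first non-missing value of x,
# or NA when every value in x is missing.
first_non_na <- function(x) {
  keep <- x[!is.na(x)]
  if (length(keep) == 0) NA_real_ else keep[1]
}

# Apply the helper to every ratio column, grouped by Identifier.
# df[-1] is all columns except the first; df["Identifier"] is the grouping.
consolidated <- aggregate(df[-1], by = df["Identifier"], FUN = first_non_na)

print(consolidated)
```

If an Identifier can have several non-missing values in the same column, this sketch silently keeps only the first one, so check your data for that case before relying on it.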