R: collapse all columns by an ID column

Question

I'm trying to do something similar to what's answered here, which gets me 80% of the way. I have a data frame with one ID column and multiple information columns. I'd like to roll up all of the other columns so that there's only one row for each ID, and multiple entries are separated by, for instance, a semicolon. Here's an example of what I have and what I want.

HAVE:

     ID  info1          info2
1 id101    one          first
2 id102   twoA second alias A
3 id102   twoB second alias B
4 id103 threeA  third alias A
5 id103 threeB  third alias B
6 id104   four         fourth
7 id105   five          fifth

WANT:

     ID          info1                          info2
1 id101            one                          first
2 id102     twoA; twoB second alias A; second alias B
3 id103 threeA; threeB   third alias A; third alias B
4 id104           four                         fourth
5 id105           five                          fifth

Here's the code used to generate those:

have <- data.frame(ID=paste0("id", c(101, 102, 102, 103, 103, 104, 105)),
                   info1=c("one", "twoA", "twoB", "threeA", "threeB", "four", "five"), 
                   info2=c("first", "second alias A", "second alias B", "third alias A", "third alias B", "fourth", "fifth"),
                   stringsAsFactors=FALSE)
# data.frame() rather than data_frame(): stringsAsFactors is a data.frame() argument
want <- data.frame(ID=paste0("id", 101:105),
                   info1=c("one", "twoA; twoB", "threeA; threeB", "four", "five"), 
                   info2=c("first", "second alias A; second alias B", "third alias A; third alias B", "fourth", "fifth"),
                   stringsAsFactors=FALSE)

This question asked basically the same thing, but for only a single "info" column. I have multiple other columns and would like to do this for all of them.

Bonus points for doing this using dplyr.

Recommended answer

Here's an option using summarise_each (which makes it easy to apply the changes to all columns except the grouping variables) and toString:

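library(dplyr)

# Note: summarise_each() and funs() are deprecated in current dplyr releases
have %>%
  group_by(ID) %>%                  # one group per ID
  summarise_each(funs(toString))    # collapse each non-grouping column with ", "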

#Source: local data frame [5 x 3]
#
#     ID          info1                          info2
#1 id101            one                          first
#2 id102     twoA, twoB second alias A, second alias B
#3 id103 threeA, threeB   third alias A, third alias B
#4 id104           four                         fourth
#5 id105           five                          fifth

Or, if you want it separated by semicolons, you can use:

have %>%
  group_by(ID) %>%
  summarise_each(funs(paste(., collapse = "; ")))
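
Since summarise_each() and funs() have been deprecated in later dplyr releases, the same idea can probably be written with summarise() and across(). The following is a minimal sketch, assuming dplyr 1.0.0 or newer is installed; the name `collapsed` is only illustrative and is not part of the original answer.

```r
library(dplyr)

# Same sample data as in the question.
have <- data.frame(
  ID    = paste0("id", c(101, 102, 102, 103, 103, 104, 105)),
  info1 = c("one", "twoA", "twoB", "threeA", "threeB", "four", "five"),
  info2 = c("first", "second alias A", "second alias B",
            "third alias A", "third alias B", "fourth", "fifth"),
  stringsAsFactors = FALSE
)

# Inside summarise(), across(everything(), ...) covers every column
# except the grouping variable, so ID is left alone.
collapsed <- have %>%
  group_by(ID) %>%
  summarise(across(everything(), function(x) paste(x, collapse = "; ")))

print(collapsed)
```

Swapping the anonymous function for `toString` should give the comma-separated variant shown earlier.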
