R: collapse all columns by an ID column
Problem description
I'm trying to do something similar to what's answered here, which gets me 80% of the way. I have a data frame with one ID column and multiple information columns. I'd like to roll up all of the other columns so that there's only one row for each ID, and multiple entries are separated by, for instance, a semicolon. Here's an example of what I have and what I want.
HAVE:
     ID  info1          info2
1 id101    one          first
2 id102   twoA second alias A
3 id102   twoB second alias B
4 id103 threeA  third alias A
5 id103 threeB  third alias B
6 id104   four         fourth
7 id105   five          fifth
WANT:
     ID          info1                          info2
1 id101            one                          first
2 id102     twoA; twoB second alias A; second alias B
3 id103 threeA; threeB   third alias A; third alias B
4 id104           four                         fourth
5 id105           five                          fifth
Here's the code used to generate those:
# Input: one row per (ID, entry) pair
have <- data.frame(ID=paste0("id", c(101, 102, 102, 103, 103, 104, 105)),
                   info1=c("one", "twoA", "twoB", "threeA", "threeB", "four", "five"),
                   info2=c("first", "second alias A", "second alias B", "third alias A", "third alias B", "fourth", "fifth"),
                   stringsAsFactors=FALSE)
# Desired output: one row per ID, multiple entries joined with "; "
want <- data.frame(ID=paste0("id", c(101:105)),
                   info1=c("one", "twoA; twoB", "threeA; threeB", "four", "five"),
                   info2=c("first", "second alias A; second alias B", "third alias A; third alias B", "fourth", "fifth"),
                   stringsAsFactors=FALSE)
This question asked basically the same thing, but with only a single "info" column. I have multiple other columns and would like to do this for all of them.
Bonus points for doing this using dplyr.
Recommended answer
Here's an option using summarise_each (which makes it easy to apply the changes to all columns except the grouping variables) and toString:
library(dplyr)
have %>%
  group_by(ID) %>%
  summarise_each(funs(toString))
#Source: local data frame [5 x 3]
#
#     ID          info1                          info2
#1 id101            one                          first
#2 id102     twoA, twoB second alias A, second alias B
#3 id103 threeA, threeB   third alias A, third alias B
#4 id104           four                         fourth
#5 id105           five                          fifth
Or, if you want it separated by semicolons, you can use:
have %>%
  group_by(ID) %>%
  summarise_each(funs(paste(., collapse = "; ")))
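Note that summarise_each() and funs() have since been deprecated in dplyr. As a rough sketch of the same idea for newer releases (assuming dplyr 1.0.0 or later, where across() is available), something like the following should do the same thing. The name collapsed is just an illustrative variable, and have is rebuilt here only so the snippet runs on its own:

```r
library(dplyr)

# Same example data as in the question
have <- data.frame(
  ID    = paste0("id", c(101, 102, 102, 103, 103, 104, 105)),
  info1 = c("one", "twoA", "twoB", "threeA", "threeB", "four", "five"),
  info2 = c("first", "second alias A", "second alias B",
            "third alias A", "third alias B", "fourth", "fifth"),
  stringsAsFactors = FALSE
)

# across(everything(), ...) applies the function to every non-grouping column;
# paste(..., collapse = "; ") joins each group's values into a single string
collapsed <- have %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ paste(.x, collapse = "; ")))

collapsed
```

Swapping the lambda for toString gives the comma-separated variant instead.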