Aggregating by unique identifier and concatenating related values into a string
Question
I have a need that I imagine could be satisfied by aggregate or reshape, but I can't quite figure it out.
I have a list of names (brand) and accompanying ID numbers (id). This data is in long form, so names can have multiple IDs. I'd like to de-duplicate by the name (brand) and concatenate the multiple possible ids into a string separated by commas.
For example:
brand id
RadioShack 2308
Rag & Bone 4466
Ragu 1830
Ragu 4518
Ralph Lauren 1638
Ralph Lauren 2719
Ralph Lauren 2720
Ralph Lauren 2721
Ralph Lauren 2722
should become:
RadioShack 2308
Rag & Bone 4466
Ragu 1830,4518
Ralph Lauren 1638,2719,2720,2721,2722
How would I accomplish this?
Recommended answer
Let's call your data.frame DF:
> aggregate(id ~ brand, data = DF, c)
brand id
1 RadioShack 2308
2 Rag & Bone 4466
3 Ragu 1830, 4518
4 Ralph Lauren 1638, 2719, 2720, 2721, 2722
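For anyone who wants to reproduce this, here is a minimal sketch. The hand-built DF is an assumption (the question only shows the printed table), and res is just an illustrative name. It shows that with c the id column comes back as a list, because the groups have different sizes:

```r
# Rebuild the example data from the question (assumed construction).
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

# Group sizes differ (1, 1, 2, 5), so aggregate() cannot simplify
# the result and keeps each group's ids as one list element.
res <- aggregate(id ~ brand, data = DF, c)

is.list(res$id)                        # TRUE
res$id[[which(res$brand == "Ragu")]]   # the two Ragu ids as a numeric vector
```

A list column prints fine but can be awkward to write out or merge, which is why the string-based variants can be more convenient.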
Another alternative using aggregate is:
result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")
This produces the same result, and now id is no longer a list (thanks to @Frank's comment). To see the class of each column, try:
> sapply(result, class)
brand id
"factor" "character"
As mentioned by @DavidArenburg in the comments, another alternative is to use the toString function:
aggregate(id ~ brand, data = DF, toString)
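The two string-producing variants differ mainly in the separator: toString joins with a comma followed by a space, while paste with collapse uses exactly the separator you pass. A small sketch, assuming the hand-built DF below stands in for the real data (by_paste and by_toString are illustrative names):

```r
# Rebuild the example data from the question (assumed construction).
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

by_paste    <- aggregate(id ~ brand, data = DF, paste, collapse = ",")
by_toString <- aggregate(id ~ brand, data = DF, toString)

# Look rows up by brand rather than by position.
by_paste$id[by_paste$brand == "Ragu"]        # "1830,4518"
by_toString$id[by_toString$brand == "Ragu"]  # "1830, 4518"

# Each group collapses to a single string, so both id columns
# are plain character vectors rather than lists.
is.character(by_paste$id)     # TRUE
is.character(by_toString$id)  # TRUE
```

Note that on R 4.0.0 or later, data.frame() no longer converts strings to factors by default, so brand would report "character" rather than "factor" unless DF was created with stringsAsFactors = TRUE.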