Aggregating by unique identifier and concatenating related values into a string
Question
I have a need that I imagine could be satisfied by aggregate or reshape, but I can't quite figure it out.
I have a list of names (brand) and accompanying ID numbers (id). This data is in long form, so a name can have multiple IDs. I'd like to de-duplicate by the name (brand) and concatenate the multiple possible ids into a string separated by a comma.
For example:
brand id
RadioShack 2308
Rag & Bone 4466
Ragu 1830
Ragu 4518
Ralph Lauren 1638
Ralph Lauren 2719
Ralph Lauren 2720
Ralph Lauren 2721
Ralph Lauren 2722
Should become:
RadioShack 2308
Rag & Bone 4466
Ragu 1830,4518
Ralph Lauren 1638,2719,2720,2721,2722
How would I accomplish this?
Recommended answer
Let's call your data.frame DF
> aggregate(id ~ brand, data = DF, c)
brand id
1 RadioShack 2308
2 Rag & Bone 4466
3 Ragu 1830, 4518
4 Ralph Lauren 1638, 2719, 2720, 2721, 2722
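To try the call above, here is a minimal sketch that rebuilds the question's sample data. The name res_list is only an illustrative choice, and the ids are assumed to be numeric. It also shows why the next variant is useful: when groups have different sizes, combining with c leaves id as a list column rather than a string.

```r
# Rebuild the sample data from the question; DF is the name used in the answer.
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

# Group by brand and combine each group's ids with c().
# res_list is a hypothetical name chosen for this sketch.
res_list <- aggregate(id ~ brand, data = DF, c)

# The groups differ in length, so aggregate() cannot simplify them
# and id is stored as a list with one vector per brand.
is.list(res_list$id)
res_list$id[[which(res_list$brand == "Ragu")]]
```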
Another alternative using aggregate is:
result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")
This produces the same grouping, and now id is a plain character column rather than a list. Thanks to @Frank's comment. To see the class of each column try:
> sapply(result, class)
brand id
"factor" "character"
As mentioned by @DavidArenburg in the comments, another alternative is using the toString function:
aggregate(id ~ brand, data = DF, toString)
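The two string-based variants differ only in the separator. The sketch below compares them on the question's sample data. The names res_paste and res_tostring are illustrative, and the ids are assumed to be numeric. toString(x) is equivalent to paste(x, collapse = ", "), so it inserts a space after each comma, while paste with collapse = "," matches the desired output in the question exactly.

```r
# Sample data from the question; DF is the name used in the answer.
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

# Comma only, as in the question's desired output.
res_paste <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

# Comma followed by a space.
res_tostring <- aggregate(id ~ brand, data = DF, toString)

# Both results hold one string per brand in an ordinary character column.
is.character(res_paste$id)
is.character(res_tostring$id)

res_paste$id[res_paste$brand == "Ragu"]
res_tostring$id[res_tostring$brand == "Ragu"]
```

If the space matters downstream, for example when the result is written to a CSV file and split again later, the paste form is probably the safer choice.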