Aggregating by unique identifier and concatenating related values into a string


Question

I have a need that I imagine could be satisfied by aggregate or reshape, but I can't quite figure it out.

I have a list of names (brand) and an accompanying ID number (id). The data is in long form, so a name can have multiple IDs. I'd like to de-duplicate by name (brand) and concatenate the multiple possible ids into a single string separated by commas.

For example:

brand            id 
RadioShack       2308
Rag & Bone       4466
Ragu             1830
Ragu             4518
Ralph Lauren     1638
Ralph Lauren     2719
Ralph Lauren     2720
Ralph Lauren     2721
Ralph Lauren     2722 

should become:

RadioShack       2308
Rag & Bone       4466
Ragu             1830,4518
Ralph Lauren     1638,2719,2720,2721,2722

How would I accomplish this?

Recommended answer

Let's call your data.frame DF:

> aggregate(id ~ brand, data = DF, c)
         brand                           id
1   RadioShack                         2308
2   Rag & Bone                         4466
3         Ragu                   1830, 4518
4 Ralph Lauren 1638, 2719, 2720, 2721, 2722

Another alternative using aggregate is:

result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

This produces the same grouping, but id is now a character column rather than a list (thanks to @Frank for the comment). To see the class of each column, try:

> sapply(result, class)
      brand          id 
   "factor" "character"

As mentioned by @DavidArenburg in the comments, another alternative is to use the toString function:

aggregate(id ~ brand, data = DF, toString)
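For anyone who wants to try the three variants side by side, here is a minimal sketch. The construction of DF is my own reconstruction of the sample data in the question (the original post does not show how DF was built), and the names res_list, res_paste and res_ts are only illustrative. The one practical difference between the last two is the separator: toString(x) behaves like paste(x, collapse = ", "), so it puts a space after each comma.

```r
# Assumed reconstruction of the question's sample data (not from the original post)
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id    = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

# Variant 1: c() keeps the ids of each brand as a vector, so id becomes a list column
res_list <- aggregate(id ~ brand, data = DF, FUN = c)

# Variant 2: paste() collapses each group into one string, comma-separated without spaces
res_paste <- aggregate(id ~ brand, data = DF, FUN = paste, collapse = ",")

# Variant 3: toString() collapses each group with ", " (comma plus space)
res_ts <- aggregate(id ~ brand, data = DF, FUN = toString)

# Inspect the column types of each result
sapply(res_list, class)
sapply(res_paste, class)
sapply(res_ts, class)
```

If the result has to be written out to a CSV or joined on later, the character versions (variants 2 and 3) are usually easier to work with than the list column from variant 1.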
