Aggregating by unique identifier and concatenating related values into a string

Question

I have a need that I imagine could be satisfied by aggregate or reshape, but I can't quite figure it out.

I have a list of names (brand) and accompanying ID numbers (id). The data is in long form, so a name can have multiple IDs. I'd like to de-duplicate by name (brand) and concatenate the multiple possible ids into a single string separated by commas.

For example:

brand            id 
RadioShack       2308
Rag & Bone       4466
Ragu             1830
Ragu             4518
Ralph Lauren     1638
Ralph Lauren     2719
Ralph Lauren     2720
Ralph Lauren     2721
Ralph Lauren     2722 

should become:

RadioShack       2308
Rag & Bone       4466
Ragu             1830,4518
Ralph Lauren     1638,2719,2720,2721,2722

How would I go about this?

Recommended answer

Let's call your data.frame DF

> aggregate(id ~ brand, data = DF, c)
         brand                           id
1   RadioShack                         2308
2   Rag & Bone                         4466
3         Ragu                   1830, 4518
4 Ralph Lauren 1638, 2719, 2720, 2721, 2722
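
To try this yourself, here is a minimal sketch that rebuilds DF from the example rows in the question. The construction of DF is an assumption, since the question does not show how the data was created. It also checks that, with c, the id column comes back as a list, because the groups have different sizes.

```r
# Hypothetical reconstruction of the question's data (not shown in the original post)
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu", rep("Ralph Lauren", 5)),
  id    = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

# Collect all ids of each brand with c()
res <- aggregate(id ~ brand, data = DF, c)

# Groups differ in length, so aggregate cannot simplify them: id stays a list
is.list(res$id)

# Pull out the ids of one brand by name rather than by row position
ragu_ids <- res$id[[which(as.character(res$brand) == "Ragu")]]
ragu_ids
```

A list column prints fine, but it can be awkward to export (for example to CSV), which is one reason to prefer a version that returns plain strings.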

Another alternative using aggregate is:

result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

This groups the rows the same way, but id is now a character vector rather than a list, with the values joined by "," and no space (thanks to @Frank's comment). To see the class of each column, try:

> sapply(result, class)
      brand          id 
   "factor" "character"

As @DavidArenburg mentioned in the comments, another alternative is the toString function, which joins the values with ", " (comma followed by a space):

aggregate(id ~ brand, data = DF, toString)
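
The paste and toString variants differ only in the separator they use. The following sketch compares them side by side; the construction of DF and the names with_paste and with_toString are assumptions introduced for illustration.

```r
# Hypothetical reconstruction of the question's data (not shown in the original post)
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu", rep("Ralph Lauren", 5)),
  id    = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722)
)

with_paste    <- aggregate(id ~ brand, data = DF, paste, collapse = ",")
with_toString <- aggregate(id ~ brand, data = DF, toString)

# Each group yields a single string, so both id columns are character vectors
is.character(with_paste$id)
is.character(with_toString$id)

# Only the separator differs
ragu_paste    <- with_paste$id[as.character(with_paste$brand) == "Ragu"]
ragu_toString <- with_toString$id[as.character(with_toString$brand) == "Ragu"]
ragu_paste      # "1830,4518"
ragu_toString   # "1830, 4518"
```

If the output must match the question's format exactly (no space after the comma), the paste version with collapse = "," is the closer fit.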
