R group by and aggregate - return relative rank within groups using plyr

Views: 154 Published: 2017/7/13 20:54:23 Tags: r group-by aggregate plyr dplyr


Problem description

UPDATE: I have a data frame 'test' that looks like this:

    session_id  seller_feedback_score
1   1           282470
2   1           275258
3   1           275258
4   1           275258
5   1           37831
6   1           282470
7   1           26
8   1           138351
9   1           321350
10  1           841
11  1           138351
12  1           17263
13  1           282470
14  1           396900
15  1           282470
16  1           282470
17  1           321350
18  1           321350
19  1           321350
20  1           0
21  1           1596
22  7           282505
23  7           275283
24  7           275283
25  7           275283
26  7           37834
27  7           282505
28  7           26
29  7           138359
30  7           321360

and some code (using package plyr) that should apparently rank 'seller_feedback_score' within each group of session_id:

 test <- test %>% group_by(session_id) %>% 
  mutate(seller_feedback_score_rank = dense_rank(-seller_feedback_score))

However, what actually happens is that R ranks the entire data frame together, ignoring the groups (the session_id values):

    session_id  seller_feedback_score  seller_feedback_score_rank_2
1   1           282470                 5
2   1           275258                 7
3   1           275258                 7
4   1           275258                 7
5   1           37831                  11
6   1           282470                 5
7   1           26                     15
8   1           138351                 9
9   1           321350                 3
10  1           841                    14
11  1           138351                 9
12  1           17263                  12
13  1           282470                 5
14  1           396900                 1
15  1           282470                 5
16  1           282470                 5
17  1           321350                 3
18  1           321350                 3
19  1           321350                 3
20  1           0                      16
21  1           1596                   13
22  7           282505                 4
23  7           275283                 6
24  7           275283                 6
25  7           275283                 6
26  7           37834                  10
27  7           282505                 4
28  7           26                     15
29  7           138359                 8
30  7           321360                 2

I checked this by counting the unique 'seller_feedback_score_rank' values and, not surprisingly, the count equals the highest rank value. I'd appreciate it if someone could reproduce this and help. Thanks.

Solution

One option:

library(dplyr)
df %>% group_by(session_id) %>% 
  mutate(rank = dense_rank(-seller_feedback_score))

dense_rank is "like min_rank, but with no gaps between ranks" so I negated the seller_feedback_score column in order to turn it into something like max_rank (which doesn't exist in dplyr).
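To make the "no gaps" distinction concrete, here is a small sketch. The four-value vector is only an illustration taken from the first rows of the question's data, not the full data set:

```r
library(dplyr)

# Illustrative scores with one tie (assumed subset of the question's data)
scores <- c(282470, 275258, 275258, 37831)

# Negating the values makes the highest score rank first
dense  <- dense_rank(-scores)  # tied values share a rank, the next rank follows directly
gapped <- min_rank(-scores)    # tied values share a rank, the next rank skips ahead

print(dense)   # 1 2 2 3
print(gapped)  # 1 2 2 4
```

dplyr also provides desc(), so dense_rank(desc(scores)) is an equivalent spelling of dense_rank(-scores) for numeric input.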

If you want the ranks with gaps so that you reach 21 for the lowest in your case, you can use min_rank instead of dense_rank:

library(dplyr)
df %>% group_by(session_id) %>% 
    mutate(rank = min_rank(-seller_feedback_score))
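If the same grouped code still ranks across the whole data frame, as in the question, one possible cause is that plyr was attached after dplyr, so that plyr::mutate masks dplyr::mutate and ignores the grouping. This is an assumption, since the question does not show which packages were attached or in what order. A minimal sketch that sidesteps the masking by calling the dplyr functions explicitly; the five-row data frame is a made-up subset of the question's data:

```r
library(dplyr)

# Hypothetical five-row subset of the question's data
test <- data.frame(
  session_id = c(1, 1, 1, 7, 7),
  seller_feedback_score = c(282470, 275258, 275258, 282505, 26)
)

# Namespace-qualified calls always reach dplyr's verbs,
# even if another attached package exports the same names
ranked <- test %>%
  dplyr::group_by(session_id) %>%
  dplyr::mutate(
    seller_feedback_score_rank = dplyr::dense_rank(-seller_feedback_score)
  ) %>%
  dplyr::ungroup()

print(ranked)
```

With this, each session_id gets its own ranks starting at 1. Attaching plyr before dplyr, or not attaching plyr at all, should avoid the masking as well.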
