How to calculate ranking of one column based on groups defined by another column?

Views: 130 | Posted: 2015/11/30 22:16:31 | Tags: algorithm, r, data, statistics


Problem description

I am using R version 2.11.1 (32-bit) on Windows 7.

I have a data set like this:

USER_A USER_B SCORE
1        6      0.2
1        7      0.1
1        10     0.15
2        6      0.2
2        9      0.12
3        8      0.15
3        9      0.3

USER_A takes values in 1:3 and USER_B takes values in 6:10. For each USER_A, I need to output the ranking of its USER_B entries by their SCORE:

USER_A      ranking of USER_B
1  3  1  2  #the ranking of USER_B 6,7,10(which belong to USER_A 1)
2  2  1     #the ranking of USER_B 6,9(which belong to USER_A 2)
3  1  2     #the ranking of USER_B 8,9(which belong to USER_A 3)

In fact, I only need to output the rankings:

3 1 2
2 1
1 2

This is frustrating because each row has a different length, so I cannot store the rankings in a matrix and then output them.

Could anyone help me solve this problem? Thanks very much!

Solution

df <- read.table(con <- textConnection("USER_A USER_B SCORE
1        6      0.2
1        7      0.1
1        10     0.15
2        6      0.2
2        9      0.12
3        8      0.15
3        9      0.3
"), header = TRUE)
close(con)

One way is to split the data:

sdf <- with(df, split(SCORE, f = USER_A))
lapply(sdf, rank)

The last line gives:

> lapply(sdf, rank)
$`1`
[1] 3 1 2

$`2`
[1] 2 1

$`3`
[1] 1 2
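
Since the question asks for the rankings printed one group per line, with rows of different lengths, the list from split() and lapply() can be written out directly without going through a matrix. The following is only a minimal sketch of that idea: the data frame is rebuilt with data.frame() so the block runs on its own, and the names ranks and out are illustrative, not from the original answer.

```r
# Rebuild the example data so this block is self-contained.
df <- data.frame(
  USER_A = c(1, 1, 1, 2, 2, 3, 3),
  USER_B = c(6, 7, 10, 6, 9, 8, 9),
  SCORE  = c(0.2, 0.1, 0.15, 0.2, 0.12, 0.15, 0.3)
)

# Rank SCORE within each USER_A group; the result is a list whose
# elements may have different lengths.
ranks <- lapply(split(df$SCORE, df$USER_A), rank)

# Collapse each group's ranks into one space-separated string,
# then print one line per group.
out <- vapply(ranks, paste, character(1), collapse = " ")
writeLines(out)
```

A list is the natural container here because, unlike a matrix, it does not require every group to have the same number of elements. writeLines() also accepts a con argument if the lines should go to a file rather than the console.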

An alternative is to use aggregate() as in:

aggregate(SCORE ~ USER_A, data = df, rank)

Which returns:

> (foo <- aggregate(SCORE ~ USER_A, data = df, rank))
  USER_A   SCORE
1      1 3, 1, 2
2      2    2, 1
3      3    1, 2

The output is a bit different here, though. We now have a data frame whose second component, SCORE, is a list, just like the output of the lapply() version:

> str(foo)
'data.frame':   3 obs. of  2 variables:
 $ USER_A: int  1 2 3
 $ SCORE :List of 3
  ..$ 0: num  3 1 2
  ..$ 1: num  2 1
  ..$ 2: num  1 2
> foo$SCORE
$`0`
[1] 3 1 2

$`1`
[1] 2 1

$`2`
[1] 1 2
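
If a flat result aligned with the original rows is acceptable, base R's ave() offers a third route that was not part of the original answer. The sketch below assumes the same example data (rebuilt here so the block runs on its own) and adds a hypothetical RANK column holding each row's rank within its USER_A group.

```r
# Rebuild the example data so this block is self-contained.
df <- data.frame(
  USER_A = c(1, 1, 1, 2, 2, 3, 3),
  USER_B = c(6, 7, 10, 6, 9, 8, 9),
  SCORE  = c(0.2, 0.1, 0.15, 0.2, 0.12, 0.15, 0.3)
)

# ave() applies FUN within each group and returns a vector the same
# length as SCORE, in the original row order.
# RANK is an illustrative column name, not from the original post.
df$RANK <- ave(df$SCORE, df$USER_A, FUN = rank)
print(df)
```

Because the result has one value per row, it sidesteps the ragged-length problem entirely, at the cost of not grouping the ranks into one line per USER_A.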
