Calculating the mean using a logical condition

Question

I have a football dataset for one season. Some of its variables are player_id, week, and points (a grade for each player in a match).

So each player_id appears several times in my dataset.

My goal is to calculate the average points for each player, but only over the previous weeks.

For example, for the row where player_id=5445 and week=10, I want the mean of points over the rows where player_id=5445 and week is from 1 to 9.

I know I can do this by filtering the data for each row and calculating the mean, but I hope to do it in a smarter/faster way...

I thought of something like:

aggregate(mydata$points, FUN=mean, 
          by=list(player_id=mydata$player_id, week<mydata$week))

but it didn't work.

Thanks!

Recommended answer

Here's a solution along with some sample data:

football_df <- 
  data.frame(player_id = c(1, 2, 3, 4),
             points = as.integer(runif(40, 0, 10)), 
             week = rep(1:10, each = 4))

To get the running average:

library(dplyr)
football_df %>% 
      group_by(player_id) %>%    # compute the statistic per player
      arrange(week) %>%          # put the rows in week order
      mutate(avg = cummean(points)) %>% # cumulative mean up to and including each week
      mutate(avg = lag(avg)) %>% # shift down one row so each week sees only earlier weeks
      arrange(player_id) # sort by player_id
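For readers who prefer to avoid dplyr, the same lagged cumulative mean can be sketched in base R. This is an alternative, not part of the original answer: the helper name `prev_mean` is hypothetical, the small data frame reuses the first three weeks of players 1 and 2 from the result table, and the sketch assumes one row per player per week.

```r
# Small deterministic sample: weeks 1 to 3 for players 1 and 2
football_df <- data.frame(
  player_id = rep(c(1, 2), times = 3),
  week      = rep(1:3, each = 2),
  points    = c(7, 6, 9, 9, 9, 5)
)

# Hypothetical helper: mean of all values strictly before each position,
# with NA for the first position (no earlier values exist)
prev_mean <- function(x) {
  n <- length(x)
  c(NA, cumsum(x)[-n] / seq_len(n - 1))
}

# Order by player and week so that "previous" means "earlier week"
football_df <- football_df[order(football_df$player_id, football_df$week), ]

# ave() applies prev_mean separately within each player_id group
football_df$avg <- ave(football_df$points, football_df$player_id, FUN = prev_mean)
print(football_df)
```

Dividing the cumulative sum by the row count gives the same value as `cummean()`, and dropping the last element while prepending `NA` plays the role of `lag()`.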

Here are the first two players in the resulting table. For player 1 in week 2, the average over the previous weeks is 7 (week 1 only), and in week 3 it is (7 + 9) / 2 = 8, and so on. The first week of each player is NA because there are no earlier weeks to average:

   player_id points week      avg
1          1      7    1       NA
2          1      9    2 7.000000
3          1      9    3 8.000000
4          1      1    4 8.333333
5          1      4    5 6.500000
6          1      8    6 6.000000
7          1      0    7 6.333333
8          1      2    8 5.428571
9          1      5    9 5.000000
10         1      8   10 5.000000
11         2      6    1       NA
12         2      9    2 6.000000
13         2      5    3 7.500000
14         2      1    4 6.666667
15         2      0    5 5.250000
16         2      9    6 4.200000
17         2      8    7 5.000000
18         2      6    8 5.428571
19         2      6    9 5.500000
20         2      8   10 5.555556
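As a sanity check, the per-row filtering described in the question can be written out directly and compared with the avg column. The sketch below is only illustrative: it uses player 1's points copied from the table above, and the names `p1` and `brute_avg` are hypothetical.

```r
# Points for player 1, weeks 1 to 10, copied from the table above
p1 <- data.frame(week = 1:10, points = c(7, 9, 9, 1, 4, 8, 0, 2, 5, 8))

# For each row, average the points of all strictly earlier weeks
brute_avg <- sapply(p1$week, function(w) {
  earlier <- p1$points[p1$week < w]
  if (length(earlier) == 0) NA_real_ else mean(earlier)
})

# Rounded to six decimals for comparison with the avg column
print(round(brute_avg, 6))
```

The rounded values agree with the avg column for player 1. This version rescans the earlier rows for every row, so the cumulative approach is the better fit for larger datasets.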
