Grouping data in R to perform a function


Question

Here is a sample of my data:

           id   score
1          82   0.50000
2          82   0.39286
3          82   0.56250
4         328   0.50000
5         328   0.67647
6         328   0.93750
7         328   0.91667

I want to add a column of moving averages of the scores for each id.

So I need to group the data by id, apply a moving-average (MA) function to each group, and store the output in another column, "MA_score".

I would like my output to look like this:

           id   score    MA_score
1          82   0.50000   NULL
2          82   0.39286   0.xxxx
3          82   0.56250   NULL
4         328   0.50000   NULL
5         328   0.67647   0.yyyy
6         328   0.93750   0.qqqq
7         328   0.91667   NULL

Recommended answer

You could use split and rollapply from the zoo package as one of many ways to approach this. Note that in the example below I set the width of rollapply to 1, so it just returns each value. For widths greater than one, it takes the mean of that many consecutive values.

require(zoo)
sapply( split( df , df$id) , function(x) rollapply( x , width = 1 , align = 'left' , mean) )
#Note that by setting width = 1 we just return the value
$`82`
     id   score
[1,] 82 0.50000
[2,] 82 0.39286
[3,] 82 0.56250

$`328`
      id   score
[1,] 328 0.50000
[2,] 328 0.67647
[3,] 328 0.93750
[4,] 328 0.91667

If we were to set width = 3, you would get:

$`82`
     id   score
[1,] 82 0.48512

$`328`
      id     score
[1,] 328 0.7046567
[2,] 328 0.8435467
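
The width = 3 figures above can be cross-checked in base R without zoo. This is only a sketch: the data frame is rebuilt from the sample in the question, and window_means is a hypothetical helper name introduced here for illustration, not part of any package.

```r
# Sample data copied from the question
df <- data.frame(
  id    = c(82, 82, 82, 328, 328, 328, 328),
  score = c(0.50000, 0.39286, 0.56250, 0.50000, 0.67647, 0.93750, 0.91667)
)

# Hypothetical helper: mean of every run of k consecutive values.
# embed(x, k) places each window of length k in one row of a matrix,
# so rowMeans() yields one mean per window.
window_means <- function(x, k) rowMeans(embed(x, k))

# Split the scores by id and compute the 3-value means within each group
ma3 <- lapply(split(df$score, df$id), window_means, k = 3)
ma3
```

Each group yields length(x) - k + 1 values, which is why id 82 (three rows) produces a single mean and id 328 (four rows) produces two.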

Or you could use aggregate from base R (still with rollapply from zoo):

aggregate(  score ~ id , data = df , function(x) rollapply( x , width = 1 , align = 'left' , mean)  )
   id                              score
1  82          0.50000, 0.39286, 0.56250
2 328 0.50000, 0.67647, 0.93750, 0.91667

There are quite a few ways to do this. I would define your moving average function precisely, though, because there are many ways to calculate it (see, for example, TTR::SMA).
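
To illustrate why the definition matters, here is a minimal base R sketch comparing a centered and a trailing 3-value mean on the scores for id 328. It uses stats::filter with equal weights; the variable names are chosen here for illustration only.

```r
# Scores for id 328, copied from the question
x <- c(0.50000, 0.67647, 0.93750, 0.91667)

# Equal weights for a simple 3-value moving average
w <- rep(1 / 3, 3)

# sides = 2: window centered on each value (NA at both ends)
centered <- as.numeric(stats::filter(x, w, sides = 2))

# sides = 1: window covers the current and the two previous values
# (NA for the first two positions)
trailing <- as.numeric(stats::filter(x, w, sides = 1))

cbind(x, centered, trailing)
```

The two versions contain the same means but place them on different rows, so the choice changes which rows of MA_score end up missing. The namespace is spelled out as stats::filter because other packages (dplyr, for instance) export a function with the same name.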

Or, more simply, use ave:

# rollmean comes from zoo; fill = NA replaces the deprecated na.pad = TRUE
within(df, { MA_score <- ave(score, id, FUN = function(x)
                rollmean(x, k = 3, fill = NA)) })
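
For completeness, a self-contained sketch of the ave approach on the sample data from the question. It assumes the zoo package is installed and that every id has at least three rows; it uses fill = NA, which the zoo documentation gives as the replacement for the deprecated na.pad argument.

```r
library(zoo)  # provides rollmean(); assumed to be installed

# Sample data copied from the question
df <- data.frame(
  id    = c(82, 82, 82, 328, 328, 328, 328),
  score = c(0.50000, 0.39286, 0.56250, 0.50000, 0.67647, 0.93750, 0.91667)
)

# ave() applies FUN within each id and returns a vector aligned with the
# original rows; rollmean() is centered by default and fill = NA pads the ends.
df$MA_score <- ave(df$score, df$id,
                   FUN = function(x) rollmean(x, k = 3, fill = NA))
df
```

With a centered window of 3, the first and last row of each id are NA, which matches the layout requested in the question (NA taking the place of NULL).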
