How to calculate growth rate in a long-format data frame?


Problem description


With data structured as follows...

df <- data.frame(Category=c(rep("A",6),rep("B",6)),
      Year=rep(2010:2015,2),Value=1:12)

I'm having a tough time creating a growth rate column (by year) within category. Can anyone help with code to create something like this...

Category  Year  Value  Growth
A         2010      1
A         2011      2   1.000
A         2012      3   0.500
A         2013      4   0.333
A         2014      5   0.250
A         2015      6   0.200
B         2010      7
B         2011      8   0.143
B         2012      9   0.125
B         2013     10   0.111
B         2014     11   0.100
B         2015     12   0.091

Solution

For these sorts of questions ("how do I compute XXX by category YYY?") there are always solutions based on by(), the data.table package, and plyr. I generally prefer plyr, which is often slower, but (to me) more transparent/elegant.

df <- data.frame(Category=c(rep("A",6),rep("B",6)),
  Year=rep(2010:2015,2),Value=1:12)


library(plyr)
# Split df by Category, add a Growth column to each piece, then recombine
ddply(df, "Category", transform,
      Growth = c(NA, exp(diff(log(Value))) - 1))

The main difference between this answer and @krlmr's is that I am using a geometric-mean trick (taking differences of logs and then exponentiating) while @krlmr computes an explicit ratio.
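As a rough illustration of the explicit-ratio alternative (this is not @krlmr's actual code, which is not reproduced here), a base R sketch might look like the following. It assumes the rows are already sorted by Year within each Category, and it uses ave() instead of plyr:

```r
df <- data.frame(Category = c(rep("A", 6), rep("B", 6)),
                 Year = rep(2010:2015, 2), Value = 1:12)

# Explicit ratio within each category: (x[t+1] - x[t]) / x[t].
# ave() applies FUN to the Value entries of each Category and returns
# the results in the original row order. The first year of each
# category has no predecessor, so its growth is NA.
df$Growth <- ave(as.numeric(df$Value), df$Category,
                 FUN = function(x) c(NA, diff(x) / x[-length(x)]))

print(df, digits = 3)
```

If the data might not be sorted, ordering by Category and Year first (for example with order()) would be a sensible precaution.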

Mathematically, diff(log(Value)) is taking the differences of the logs, i.e. log(x[t+1])-log(x[t]) for all t. When we exponentiate that we get the ratio x[t+1]/x[t] (because exp(log(x[t+1])-log(x[t])) = exp(log(x[t+1]))/exp(log(x[t])) = x[t+1]/x[t]). The OP wanted the fractional change rather than the multiplicative growth rate (i.e. x[t+1]==x[t] corresponds to a fractional change of zero rather than a multiplicative growth rate of 1.0), so we subtract 1.
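A quick numerical check of that identity, using the category B values from the question (the variable names here are just for illustration):

```r
x <- c(7, 8, 9, 10, 11, 12)   # category B values from the question

# Differences of logs, exponentiated, minus 1
log_trick <- exp(diff(log(x))) - 1

# The same quantity computed directly as x[t+1] / x[t] - 1
explicit <- x[-1] / x[-length(x)] - 1

# TRUE if the two agree up to floating-point tolerance
print(isTRUE(all.equal(log_trick, explicit)))

# Rounded to three decimals for comparison with the table in the question
print(round(log_trick, 3))
```

Note that the log-based version assumes every Value is strictly positive, since log() returns -Inf for zero and NaN for negative inputs.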

I am also using transform() for a little bit of extra "syntactic sugar", to avoid creating a new anonymous function.
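For comparison, here is a sketch of the by()-based route mentioned at the start of the answer. It is one possible way to write it, not the only one, and it does need an explicit anonymous function, which is the extra step that transform() avoids in the plyr version:

```r
df <- data.frame(Category = c(rep("A", 6), rep("B", 6)),
                 Year = rep(2010:2015, 2), Value = 1:12)

# by() splits df on Category and calls the function on each sub-data-frame
pieces <- by(df, df$Category, function(d) {
  d$Growth <- c(NA, exp(diff(log(d$Value))) - 1)
  d
})

# Stack the per-category pieces back into a single data frame
res <- do.call(rbind, pieces)
rownames(res) <- NULL

print(res, digits = 3)
```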
