Obtaining year-on-year percentage change by group


Question

I'm working with a data set corresponding to the extract:

set.seed(1)
df <- data.frame(indicator=runif(n = 100),cohort=letters[1:4],
                 year=rep(1976:2000, each=4))

I would like to generate a variable with percentage year-on-year change for each cohort represented in the data set. I have tried to use the code below (from this discussion):

df$ind_per_chng <- transform(df, new.col = c(NA, indicator[-1]/indicator[-nrow(df)] - 1))

but I'm interested in making it work within each subgroup and generating only one extra column with percentage change instead of set of columns that are presently created:

> head(df)
  indicator cohort year ind_per_chng.indicator ind_per_chng.cohort ind_per_chng.year
1 0.2655087      a 1976              0.2655087                   a              1976
2 0.3721239      b 1976              0.3721239                   b              1976
3 0.5728534      c 1976              0.5728534                   c              1976
4 0.9082078      d 1976              0.9082078                   d              1976
5 0.2016819      a 1977              0.2016819                   a              1977
6 0.8983897      b 1977              0.8983897                   b              1977
  ind_per_chng.new.col
1                   NA
2            0.4015509
3            0.5394157
4            0.5854106
5           -0.7779342
6            3.4544877


Edit

To answer the useful comments, the format of the output should correspond to the table below:

There are no other changes to original data.frame with exception of the column that provides value for the percentage change for the selected variable for each cohort across years.

Solution

I'm not sure I'm correctly understanding what you want the output to look like, but is that what you're after?

library(dplyr)
df2 <- df %>%
    group_by(cohort) %>%
    arrange(year) %>%
    mutate(pct.chg = (indicator - lag(indicator)) / lag(indicator))

If you want your percentages on a 0-100 scale instead of 0-1, add 100 * () to that last line, so mutate(pct.chg = 100 * ((indicator - lag(indicator))/lag(indicator))). Here's what the result looks like:

  indicator cohort year    pct.chg
1 0.2655087      a 1976         NA
2 0.2016819      a 1977 -24.039416
3 0.6291140      a 1978 211.933767
4 0.6870228      a 1979   9.204818
5 0.7176185      a 1980   4.453369
6 0.9347052      a 1981  30.250993
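
For comparison, the same per-cohort calculation can be sketched in base R without dplyr. This is not part of the original answer; it is a minimal alternative that assumes the data frame from the question, sorts it by cohort and year, and uses `ave()` to apply the change calculation within each cohort. The column name `pct.chg` is kept from the answer above, and the values are on the 0-100 scale.

```r
set.seed(1)
df <- data.frame(indicator = runif(n = 100), cohort = letters[1:4],
                 year = rep(1976:2000, each = 4))

# Sort so that, within each cohort, rows are in ascending year order.
df <- df[order(df$cohort, df$year), ]
rownames(df) <- NULL

# ave() splits indicator by cohort, applies FUN to each piece, and returns
# a vector aligned with the rows of df. The first year of each cohort has
# no previous value, so its change is NA.
df$pct.chg <- ave(df$indicator, df$cohort,
                  FUN = function(x) 100 * c(NA, diff(x) / head(x, -1)))

head(df)
```

Sorting by `cohort` and then `year` gives the row order shown in the table above. In recent dplyr versions `arrange()` ignores grouping unless `.by_group = TRUE` is passed, so the dplyr result may come back ordered by year across cohorts; the computed `pct.chg` values are the same either way.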
