Obtaining year-on-year percentage change by group
Problem description
I'm working with a data set that looks like the extract below:
set.seed(1)
df <- data.frame(indicator=runif(n = 100),cohort=letters[1:4],
year=rep(1976:2000, each=4))
I would like to generate a variable with percentage year-on-year change for each cohort
represented in the data set. I have tried to use the code below (from this discussion):
df$ind_per_chng <- transform(df, new.col=c(NA,indicator[-1]/indicator[-nrow(df)]-1))
but I'm interested in making it work within each subgroup and generating only one extra column with the percentage change, instead of the set of columns that is presently created:
> head(df)
indicator cohort year ind_per_chng.indicator ind_per_chng.cohort ind_per_chng.year
1 0.2655087 a 1976 0.2655087 a 1976
2 0.3721239 b 1976 0.3721239 b 1976
3 0.5728534 c 1976 0.5728534 c 1976
4 0.9082078 d 1976 0.9082078 d 1976
5 0.2016819 a 1977 0.2016819 a 1977
6 0.8983897 b 1977 0.8983897 b 1977
ind_per_chng.new.col
1 NA
2 0.4015509
3 0.5394157
4 0.5854106
5 -0.7779342
6 3.4544877
Edit
To answer the useful comments, the format of the output should correspond to the table below:
There are no other changes to the original data.frame, with the exception of the column that provides the percentage change of the selected variable for each cohort across years.
I'm not sure I'm correctly understanding what you want the output to look like, but is that what you're after?
library(dplyr)
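df2 <- df %>%
  group_by(cohort) %>%
  # sort by cohort first, then by year; without .by_group = TRUE,
  # current dplyr versions ignore the grouping when arranging
  arrange(year, .by_group = TRUE) %>%
  # lag() is evaluated within each cohort because the data is grouped
  mutate(pct.chg = (indicator - lag(indicator))/lag(indicator))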
If you want your percentages on a 0-100 scale instead of 0-1, wrap the expression in that last line in 100 * (), so it becomes mutate(pct.chg = 100 * ((indicator - lag(indicator))/lag(indicator))). Here's what the result looks like on the 0-100 scale:
indicator cohort year pct.chg
1 0.2655087 a 1976 NA
2 0.2016819 a 1977 -24.039416
3 0.6291140 a 1978 211.933767
4 0.6870228 a 1979 9.204818
5 0.7176185 a 1980 4.453369
6 0.9347052 a 1981 30.250993
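If you'd rather not depend on dplyr, the same within-cohort calculation can probably be done in base R as well. The following is only a sketch under a few assumptions: it reuses the example data from the question, it assumes every cohort has exactly one row per year with no gaps, and `df3` is just a name picked here for illustration. It sorts by cohort and year, then uses `ave()` to apply the change formula separately within each cohort, which adds a single extra column.

```r
# Example data, as defined in the question
set.seed(1)
df <- data.frame(indicator = runif(n = 100), cohort = letters[1:4],
                 year = rep(1976:2000, each = 4))

# Sort by cohort, then year, so that neighbouring rows within a cohort
# are consecutive years (df3 is a hypothetical name for the sorted copy)
df3 <- df[order(df$cohort, df$year), ]

# ave() applies FUN to `indicator` separately for each cohort and returns
# a vector aligned with the rows of df3; the first year of each cohort has
# no previous value, so it gets NA
df3$pct.chg <- ave(df3$indicator, df3$cohort,
                   FUN = function(x) c(NA, 100 * (x[-1] / x[-length(x)] - 1)))

head(df3)
```

The first rows of `df3` should agree with the dplyr result shown above. If some cohort-year combinations were missing, both approaches would compare each row with the previous available row rather than with the previous calendar year, so that case would need extra handling.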