Calculate change since base year?


Question

I have a dataset that looks something like this:

df1 <- data.frame(id = c(rep("A1",4), rep("A2",4)),
                  time = rep(c(0,2:4), 2),
                  y1 = rnorm(8),
                  y2 = rnorm(8))

For each of the y variables, I want to calculate their change since time==0. Basically, I want to do this:

calc_change <- function(id, data){
  # y1: value at time == 0, then the difference at times 2, 3 and 4
  y1_0 <- data$y1[which(data$time==0 & data$id==id)]
  D2y1 <- data$y1[which(data$time==2 & data$id==id)] - y1_0
  D3y1 <- data$y1[which(data$time==3 & data$id==id)] - y1_0
  D4y1 <- data$y1[which(data$time==4 & data$id==id)] - y1_0
  # y2: same calculation
  y2_0 <- data$y2[which(data$time==0 & data$id==id)]
  D2y2 <- data$y2[which(data$time==2 & data$id==id)] - y2_0
  D3y2 <- data$y2[which(data$time==3 & data$id==id)] - y2_0
  D4y2 <- data$y2[which(data$time==4 & data$id==id)] - y2_0
  # Output: one row per (delta, outcome) pair for this id
  out <- data.frame(id=id, delta=rep(2:4, 2), 
           outcome=c(rep("y1",3), rep("y2",3)),
           change = c(D2y1, D3y1, D4y1,
                      D2y2, D3y2, D4y2))
  out
}

library(purrr)
library(dplyr)  # provides bind_rows()

changes <- map(.x = unique(df1$id), .f = calc_change, data=df1) %>% 
  map_df(bind_rows)

My guess is that there is a more efficient way of doing this. Alas, I cannot think of it. Suggestions?

Recommended answer

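To calculate the change since time == 0, you can use cumsum + diff: the cumulative sum of successive differences equals each value minus the first one, provided the rows within each id are ordered by time with time == 0 first. Because the summarised result has more than one element per group, wrap it in a list first, then unnest, and use gather to reshape the result to long format: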

library(tidyverse)
df1 %>% 
    group_by(id) %>% 
    summarise_all(~ list(cumsum(diff(.)))) %>% 
    unnest() %>% rename(delta = time) %>% 
    gather(outcome, change, y1:y2) %>% 
    arrange(id) -> changes2

changes2
# A tibble: 12 x 4
#       id delta outcome     change
#   <fctr> <dbl>   <chr>      <dbl>
# 1     A1     2      y1  2.2827244
# 2     A1     3      y1  2.2070326
# 3     A1     4      y1  1.9530212
# 4     A1     2      y2 -2.1263046
# 5     A1     3      y2 -0.5430784
# 6     A1     4      y2 -0.3109535
# 7     A2     2      y1 -1.8587070
# 8     A2     3      y1 -1.1399270
# 9     A2     4      y1  1.5667202
#10     A2     2      y2 -2.0047108
#11     A2     3      y2 -3.4414667
#12     A2     4      y2 -1.3662450
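The identity behind cumsum + diff can be checked on a small vector. The sketch below uses a made-up vector x (not taken from the question) and assumes its first element is the baseline:

```r
# Hypothetical values; the first element plays the role of time == 0
x <- c(10, 13, 12, 18)

# diff() gives successive differences; cumsum() telescopes them
# back into differences from the first element
via_cumsum <- cumsum(diff(x))   # 3 2 8

# Direct computation: every later value minus the baseline
direct <- x[-1] - x[1]          # 3 2 8

all.equal(via_cumsum, direct)
# [1] TRUE
```

Applied to the time column itself, cumsum(diff(c(0, 2, 3, 4))) returns 2 3 4, which is why the answer can simply rename time to delta. That shortcut only holds because the baseline time is 0.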

To check that the result matches the original changes:

changes$delta <- as.numeric(changes$delta)
changes$outcome <- as.character(changes$outcome)
all.equal(as.data.frame(changes2), changes)
# [1] TRUE
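In recent dplyr and tidyr releases, summarise_all() and gather() are superseded, and unnest() without a column argument emits a warning. A possible equivalent with across() and pivot_longer() is sketched below. The name changes3 and the set.seed() call are assumptions added for illustration, and the sketch assumes each id has exactly one row with time == 0:

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
df1 <- data.frame(id = c(rep("A1", 4), rep("A2", 4)),
                  time = rep(c(0, 2:4), 2),
                  y1 = rnorm(8),
                  y2 = rnorm(8))

changes3 <- df1 %>%
  group_by(id) %>%
  # subtract each group's value at time == 0 from every y column
  mutate(across(c(y1, y2), ~ .x - .x[time == 0])) %>%
  ungroup() %>%
  # drop the baseline rows, whose change is zero by construction
  filter(time != 0) %>%
  rename(delta = time) %>%
  # reshape to one row per id, delta and outcome
  pivot_longer(c(y1, y2), names_to = "outcome", values_to = "change") %>%
  arrange(id, outcome, delta)

changes3
```

Unlike cumsum + diff, this version looks up the baseline by time == 0 rather than by position, so it does not depend on the row order within each id.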
