Substituting dates with number of days in time series


Question

I have the following data on student scores from several pretests taken before their actual exam.

a<-(c("2013-02-25","2013-03-13","2013-04-24","2013-05-12","2013-07-12","2013-08-11","actual_exam_date"))
b<-c(300,230,400,NA,NA,NA,"2013-04-30")
c<-c(NA,260,410,420,NA,NA,"2013-05-30")
d<-c(300,230,400,NA,370,390,"2013-08-30")
df<-as.data.frame(rbind(b,c,d))
colnames(df)<-a
rownames(df)<-(c("student 1","student 2","student 3"))

The actual data sheet is much larger. Because the dates vary so much, while the timing of the pretests relative to the exam is fairly similar across students, I would rather convert the actual dates into the number of days before the exam and use those as the new column names instead of dates. I understand that this will merge some of the columns, which is OK. How can I do that?

Recommended answer

This is another good use case for reshape2, because you want to go to long form for plotting. For example:

# you are going to need the student id as a field
df$student_id <- row.names(df)

library('reshape2')

df2 <- melt(df, id.vars = c('student_id','actual_exam_date'),
                variable.name = 'pretest_date',
                value.name = 'pretest_score')

# drop empty observations
df2 <- df2[!is.na(df2$pretest_score),]

# these need to be dates
df2$actual_exam_date <- as.Date(df2$actual_exam_date)
df2$pretest_date <- as.Date(df2$pretest_date)

# date difference
df2$days_before_exam <- as.integer(df2$actual_exam_date - df2$pretest_date)

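# scores need to be numeric; go through character first, because on
# R versions before 4.0 these columns are factors and as.numeric()
# alone would return the factor codes instead of the scores
df2$pretest_score <- as.numeric(as.character(df2$pretest_score))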

# now you can make some plots
library('ggplot2')

ggplot(df2, aes(x = days_before_exam, y = pretest_score, col=student_id) ) + 
  geom_line(lwd=1) + scale_x_reverse() + 
  geom_vline(xintercept = 0, linetype = 'dashed', lwd = 1) +
  ggtitle('Pretest Performance') + xlab('Days Before Exam') + ylab('Pretest Score') 
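
The long form above is convenient for plotting, but the question also asks for a table whose column names are days before the exam. One possible way to get there is to cast the long data back to wide form with reshape2's dcast. The sketch below is only an illustration under some assumptions: it rebuilds the example data with explicit column types instead of reusing df, the names long and wide are made up for this example, and taking the mean is an arbitrary choice for the case where two pretests of one student land on the same day count.

```r
library(reshape2)

# Example data from the question, rebuilt with explicit types
# (assumption: scores are numeric, dates are ISO-formatted text).
df <- data.frame(
  student_id       = c("student 1", "student 2", "student 3"),
  actual_exam_date = c("2013-04-30", "2013-05-30", "2013-08-30"),
  "2013-02-25" = c(300, NA, 300),
  "2013-03-13" = c(230, 260, 230),
  "2013-04-24" = c(400, 410, 400),
  "2013-05-12" = c(NA, 420, NA),
  "2013-07-12" = c(NA, NA, 370),
  "2013-08-11" = c(NA, NA, 390),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Long form: one row per student and pretest, dropping missing scores.
long <- melt(df, id.vars = c("student_id", "actual_exam_date"),
             variable.name = "pretest_date", value.name = "pretest_score",
             na.rm = TRUE)

# Whole days between each pretest and that student's exam.
long$days_before_exam <- as.integer(
  as.Date(long$actual_exam_date) - as.Date(as.character(long$pretest_date))
)

# Back to wide form: one row per student, one column per day count.
# mean() merges scores that share a cell; fill = NA_real_ marks
# combinations with no pretest as NA.
wide <- dcast(long, student_id ~ days_before_exam,
              value.var = "pretest_score",
              fun.aggregate = mean, fill = NA_real_)

print(wide)
```

With exact day counts, few columns are likely to coincide across students. If coarser columns are wanted, the day counts could be binned before casting, for example into whole weeks with days_before_exam %/% 7.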
