Plot time series with ggplot with confidence interval

Question

If I have a data table holding a time series in which every time stamp has multiple observations, is there a direct way to plot that data set as a mean with an interval around it?

For example, create the data set:

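library(data.table)
library(ggplot2)

# Simulate 10 runs, each with 100 time stamps of a noisy linear trend
dt <- lapply(seq(1, 10), function(x) {
  data.table(Time = seq(1, 100),
             Value = seq(1, 100) * 3 + rnorm(100, 5, 20))
})

# Stack the runs; 'Run' records which list element each row came from
dt <- rbindlist(dt, idcol = 'Run')

# One thin line per run (linewidth replaces size for lines as of ggplot2 3.4.0)
ggplot(dt, aes(Time, Value, group = Run)) +
  geom_line(linewidth = 0.1, alpha = 0.5)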

Each time stamp has multiple observations. I would like the plot to look something like this:

ggplot(dt[,list(Value = mean(Value),
                MaxValue = quantile(Value, 0.9),
                MinValue = quantile(Value, 0.1)),
          list(Time)])+
  aes(x = Time, y = Value,ymin = MinValue,ymax = MaxValue)+
  geom_line()+
  geom_ribbon(alpha = 0.3)

This works, but it seems like a lot of lines for something that should be simpler. For example, if I were drawing a boxplot, a much shorter ggplot call would do:

ggplot(dt)+
  aes(x = factor(Time), y = Value)+
  geom_boxplot()

Thanks for your help!

Recommended answer

We can use stat_summary as follows.

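# mean_cl_normal needs the Hmisc package to be installed.
# fun replaces the deprecated fun.y argument as of ggplot2 3.3.0.
ggplot(dt, aes(Time, Value)) +
  stat_summary(geom = "line", fun = mean) +
  stat_summary(geom = "ribbon", fun.data = mean_cl_normal, alpha = 0.3)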
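Note that the ribbon drawn from mean_cl_normal is a confidence interval for the mean, not a band covering most of the observations, so it will usually be much narrower than a 10th to 90th percentile band. As a rough sketch of what is computed at a single time stamp (the vector `x` below is made-up sample data, and a two-sided 95% level is assumed, which is the default):

```r
# Hand-computed normal-theory (t-based) confidence interval for a mean.
# `x` is a made-up sample standing in for the values at one time stamp.
x <- c(2.1, 3.4, 2.9, 3.8, 3.1)

n  <- length(x)
m  <- mean(x)
se <- sd(x) / sqrt(n)          # standard error of the mean
tq <- qt(0.975, df = n - 1)    # t quantile for a two-sided 95% interval

# Same column names that stat_summary expects from a fun.data function
ci <- data.frame(y = m, ymin = m - tq * se, ymax = m + tq * se)
ci
```

The interval shrinks as the number of observations per time stamp grows, which is why it suits questions about the mean rather than about the spread of the runs.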

If you still want the mean with the 10th and 90th percentiles, you need to write a function that returns y, ymin, and ymax for your numeric data:

mean_cl_quantile <- function(x, q = c(0.1, 0.9), na.rm = TRUE){
  dat <- data.frame(y = mean(x, na.rm = na.rm),
                    ymin = quantile(x, probs = q[1], na.rm = na.rm),
                    ymax = quantile(x, probs = q[2], na.rm = na.rm))
  return(dat)
}
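To see what the helper hands back to stat_summary, it can be called directly on a small vector. The snippet repeats the function so that it runs on its own; the input `1:11` is an arbitrary example, not data from the question.

```r
# Helper from the answer, repeated here so the snippet is self-contained
mean_cl_quantile <- function(x, q = c(0.1, 0.9), na.rm = TRUE) {
  data.frame(y = mean(x, na.rm = na.rm),
             ymin = quantile(x, probs = q[1], na.rm = na.rm),
             ymax = quantile(x, probs = q[2], na.rm = na.rm))
}

# Arbitrary example input: the integers 1 to 11
res <- mean_cl_quantile(1:11)
res
# One row with y = 6, ymin = 2, ymax = 10
```

stat_summary calls the function once per distinct x value and uses the three columns as the line height and the ribbon limits.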

# fun replaces the deprecated fun.y argument as of ggplot2 3.3.0
ggplot(dt, aes(Time, Value)) +
  stat_summary(geom = "line", fun = mean) +
  stat_summary(geom = "ribbon", fun.data = mean_cl_quantile, alpha = 0.3)

Or, as alistaire suggested in a comment:

ggplot(dt, aes(Time, Value)) + 
  geom_smooth(stat = 'summary', fun.data = mean_cl_quantile)
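A further variation, not part of the original answer, avoids the helper function by passing the three summaries separately through the fun, fun.min, and fun.max arguments of stat_summary (available in ggplot2 3.3.0 and later). The data frame `df` and the seed below are made up for illustration and use plain base R in place of data.table.

```r
library(ggplot2)

# Made-up data: 10 observations at each of 20 time stamps
set.seed(1)
df <- data.frame(Time = rep(1:20, times = 10))
df$Value <- df$Time * 3 + rnorm(nrow(df), 5, 20)

# Ribbon from the 10th and 90th percentiles, line from the mean
p <- ggplot(df, aes(Time, Value)) +
  stat_summary(geom = "ribbon",
               fun = mean,
               fun.min = function(x) unname(quantile(x, 0.1)),
               fun.max = function(x) unname(quantile(x, 0.9)),
               alpha = 0.3) +
  stat_summary(geom = "line", fun = mean)

p
```

Whether this reads better than a named helper is a matter of taste; the helper is easier to reuse across several plots.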
