Calculating length of 95%-CI using dplyr

Views: 218 | Published: 2018/4/24 21:34:56 | Tags: r ggplot2 linechart confidence-interval trend

Question

Last time I asked how to calculate the average score per measurement occasion (week) for a variable (procras) that has been measured repeatedly for multiple respondents. My (simplified) dataset in long format looks like the following (here two students and 5 time points, no grouping variable):

studentID  week   procras
   1        0     1.4
   1        6     1.2
   1        16    1.6
   1        28    NA
   1        40    3.8
   2        0     1.4
   2        6     1.8
   2        16    2.0
   2        28    2.5
   2        40    2.8

Using dplyr I would get the average score per measurement occasion:

mean_data <- group_by(DataRlong, week) %>% summarise(procras = mean(procras, na.rm = TRUE))

The result looks like this, for example:

Source: local data frame [5 x 2]
        occ  procras
      (dbl)    (dbl)
    1     0 1.993141
    2     6 2.124020
    3    16 2.251548
    4    28 2.469658
    5    40 2.617903

With ggplot2 I can now plot the average change over time, and by adjusting the group_by() call in dplyr I can also get means per subgroup (for instance, the average score per occasion for men and women). Now I would like to add a column to the mean_data table that contains the length of the 95% CI around the average score per occasion.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/ explains how to get and plot CIs, but this approach seems to become problematic as soon as I want to do this for any subgroup, right? So is there a way to let dplyr also include the CI (based on group size, etc.) automatically in mean_data? After that, it should be fairly easy to plot the new values as CIs in the graphs, I hope. Thank you.

Solution

You could do it manually, using a few extra functions in summarise and then mutate:

library(dplyr)

mtcars %>%
  group_by(vs) %>%
  summarise(mean.mpg = mean(mpg, na.rm = TRUE),
            sd.mpg = sd(mpg, na.rm = TRUE),
            n.mpg = n()) %>%
  mutate(se.mpg = sd.mpg / sqrt(n.mpg),
         lower.ci.mpg = mean.mpg - qt(1 - (0.05 / 2), n.mpg - 1) * se.mpg,
         upper.ci.mpg = mean.mpg + qt(1 - (0.05 / 2), n.mpg - 1) * se.mpg)

#> Source: local data frame [2 x 7]
#>
#>      vs mean.mpg   sd.mpg n.mpg    se.mpg lower.ci.mpg upper.ci.mpg
#>   (dbl)    (dbl)    (dbl) (int)     (dbl)        (dbl)        (dbl)
#> 1     0 16.61667 3.860699    18 0.9099756     14.69679     18.53655
#> 2     1 24.55714 5.378978    14 1.4375924     21.45141     27.66287
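
As a rough sketch, the same pattern might be applied to the sample data from the question as follows. The column names (n_obs, mean_procras, sd_procras, se, t_crit, lower_ci, upper_ci, ci_length) are illustrative assumptions and not part of the original answer. The sketch also departs from the answer in one respect: n() counts every row in a group, including rows where procras is NA, so non-missing values are counted instead. The CI length is taken to be the upper bound minus the lower bound.

```r
library(dplyr)

# Sample data copied from the question (two students, five weeks, one NA).
DataRlong <- data.frame(
  studentID = rep(1:2, each = 5),
  week      = rep(c(0, 6, 16, 28, 40), times = 2),
  procras   = c(1.4, 1.2, 1.6, NA, 3.8,
                1.4, 1.8, 2.0, 2.5, 2.8)
)

mean_data <- DataRlong %>%
  group_by(week) %>%
  summarise(
    # Count only non-missing scores; n() would also count the NA rows.
    n_obs        = sum(!is.na(procras)),
    mean_procras = mean(procras, na.rm = TRUE),
    sd_procras   = sd(procras, na.rm = TRUE)
  ) %>%
  mutate(
    se        = sd_procras / sqrt(n_obs),
    # t quantile for a two-sided 95% interval; undefined for fewer than
    # two observations, so those rows get NA.
    t_crit    = ifelse(n_obs > 1, qt(0.975, df = pmax(n_obs - 1, 1)), NA_real_),
    lower_ci  = mean_procras - t_crit * se,
    upper_ci  = mean_procras + t_crit * se,
    ci_length = upper_ci - lower_ci
  )

print(mean_data)
```

With only two students, week 28 has a single non-missing value, so no interval can be computed there and the CI columns are NA. Adding a further variable to group_by() (say, a hypothetical gender column) would give one interval per subgroup and occasion, and lower_ci and upper_ci can then be mapped to ymin and ymax in ggplot2's geom_errorbar().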
