Plot predicted probabilities and confidence intervals in R


Problem description

This is a plot I made. I want to add confidence intervals to it, both the upper and the lower limit. So far I have computed the upper and lower limits, but I am having trouble drawing a plot that includes them.

Here are a few lines of my data; gdk is my binary response and the second variable is age.

               gdk    age prog      calender
29            FALSE    59 NASTK       11
30            FALSE    59 NASTK       10
91             TRUE    49 NMATK        9
129            TRUE    47 NFYSK        8
227           FALSE    46 LARAA       13
244            TRUE    44 LARAA       11
256            TRUE    41 LARAA        9
311           FALSE    38 NMATK        7
323           FALSE    42 NSFYY       11
393            TRUE    40 LARAA       11
449           FALSE    37 NSFYY        9
450           FALSE    38 NSFYY       10

This is the code for my first plot:

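# Observed proportion of gdk == TRUE at each age from 18 to 60
prop <- numeric()
for (i in 18:60) {
  prop[i - 17] <- mean(both$gdk[both$age == i], na.rm = TRUE)
}

# Logistic regression with gdk (the binary response) as the outcome
mod.red.fin <- glm(gdk ~ prog + age + calender, family = binomial, data = both)
newdata <- data.frame(prog = "NMATK", calender = 7, age = 18:60)
plot(18:60, predict(mod.red.fin, newdata, type = "response"))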

To get the confidence limits, I used this code:

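newdata <- data.frame(prog = "NMATK", calender = 7, age = 18:60)
# Predict on the link (logit) scale with standard errors
newdata2 <- cbind(newdata, predict(mod.red.fin, newdata, type = "link", se.fit = TRUE))
# Build the interval on the link scale, then back-transform with plogis()
newdata2 <- within(newdata2, {PredictedProb <- plogis(fit)
                              LL <- plogis(fit - (1.96 * se.fit))
                              UL <- plogis(fit + (1.96 * se.fit))})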

head(newdata2)
   prog    calender age    fit    se.fit  residual.scale        UL        LL   PredictedProb
1 NMATK        7    18 1.637162 0.2128354              1 0.8863833 0.7720644     0.8371484
2 NMATK        7    19 1.569661 0.2072370              1 0.8782376 0.7619639     0.8277353
3 NMATK        7    20 1.502160 0.2032196              1 0.8699448 0.7509808     0.8178965
4 NMATK        7    21 1.434660 0.2008779              1 0.8615687 0.7390311     0.8076263
5 NMATK        7    22 1.367159 0.2002708              1 0.8531708 0.7260410     0.7969207
6 NMATK        7    23 1.299658 0.2014139              1 0.8448057 0.7119527     0.7857774

How do I then plot the confidence interval? I need help with the code. I looked into library(ggplot2) a bit but did not come up with anything.

Recommended answer

If you want to use ggplot2 (probably the easiest way to create the plot you want), use stat_smooth().

However, there is a limitation for the plot you want: a ggplot has a single x variable, so each smooth relates gdk to one predictor at a time. That being said, here is some example code that should get you started:

d = read.table(header = TRUE, text =          
"              gdk    age prog      calender
             FALSE    59 NASTK       11
             FALSE    59 NASTK       10
              TRUE    49 NMATK        9
              TRUE    47 NFYSK        8
             FALSE    46 LARAA       13
              TRUE    44 LARAA       11
              TRUE    41 LARAA        9
             FALSE    38 NMATK        7
             FALSE    42 NSFYY       11
              TRUE    40 LARAA       11
             FALSE    37 NSFYY        9
             FALSE    38 NSFYY       10")
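## Convert gdk from TRUE/FALSE to 1/0
d$gdk2 <- as.numeric(d$gdk)
library(ggplot2)

## Logistic smooth of gdk2 on age; the shaded band is the confidence interval.
## In ggplot2 2.0.0 and later, family is passed through method.args.
plot1 <- ggplot(data = d, aes(x = age, y = gdk2)) +
            stat_smooth(method = 'glm', method.args = list(family = 'binomial')) +
            theme_bw()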

ggsave('plot1.jpg', plot1, width = 6, height = 4)

This gives you the following plot:

plot2 <- ggplot(data = d, aes(x = calender, y = gdk2)) +
            stat_smooth(method = 'glm', method.args = list(family = 'binomial')) +
            theme_bw()

ggsave('plot2.jpg', plot2, width = 6, height = 4)

This gives you the following figure.
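
Note that a smooth on a single x variable does not use the full model from the question (prog + age + calender). To plot the limits computed on the link scale for that model, one option is geom_ribbon(). The following is a minimal sketch, not the asker's actual result: the data frame `both` is simulated here (sample size and coefficients are invented for illustration), and only the column names follow the question.

```r
library(ggplot2)

# Simulated stand-in for the asker's `both` data frame.
# Assumption: invented data; only the column names match the question.
set.seed(1)
n <- 500
both <- data.frame(
  age      = sample(18:60, n, replace = TRUE),
  prog     = factor(sample(c("NASTK", "NMATK", "NFYSK", "LARAA", "NSFYY"),
                           n, replace = TRUE)),
  calender = sample(7:13, n, replace = TRUE)
)
both$gdk <- rbinom(n, 1, plogis(2.5 - 0.06 * both$age)) == 1

# Full logistic model, as in the question
mod.red.fin <- glm(gdk ~ prog + age + calender, family = binomial, data = both)

# Predict on the link scale, then back-transform the fit and its limits
newdata <- data.frame(prog = "NMATK", calender = 7, age = 18:60)
pred <- predict(mod.red.fin, newdata, type = "link", se.fit = TRUE)
newdata2 <- within(newdata, {
  PredictedProb <- plogis(pred$fit)
  LL <- plogis(pred$fit - 1.96 * pred$se.fit)
  UL <- plogis(pred$fit + 1.96 * pred$se.fit)
})

# Ribbon for the interval, line for the predicted probability
ci_plot <- ggplot(newdata2, aes(x = age, y = PredictedProb)) +
  geom_ribbon(aes(ymin = LL, ymax = UL), alpha = 0.2) +
  geom_line() +
  ylim(0, 1) +
  labs(x = "age", y = "predicted probability of gdk") +
  theme_bw()
print(ci_plot)
```

With the real data, only the block that simulates `both` would be dropped; the rest should apply unchanged as long as the column names match.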

BTW, I know ggplot2 can be hard to learn. I would suggest checking out this page for more information. Also, the first Google hit for "confidence ggplot2" was the official ggplot2 documentation for plotting confidence intervals.
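
If ggplot2 feels like too much, the same kind of band can be drawn with base graphics using polygon() and lines(). Again a sketch under assumptions: the data are simulated, and the names `fit_full`, `grid`, and `band` are made up for this example.

```r
# Simulated stand-in for the asker's `both` data frame.
# Assumption: invented data; only the column names match the question.
set.seed(1)
n <- 500
both <- data.frame(
  age      = sample(18:60, n, replace = TRUE),
  prog     = factor(sample(c("NASTK", "NMATK", "NFYSK", "LARAA", "NSFYY"),
                           n, replace = TRUE)),
  calender = sample(7:13, n, replace = TRUE)
)
both$gdk <- rbinom(n, 1, plogis(2.5 - 0.06 * both$age)) == 1

fit_full <- glm(gdk ~ prog + age + calender, family = binomial, data = both)

# Link-scale predictions and standard errors over the age range
grid <- data.frame(prog = "NMATK", calender = 7, age = 18:60)
p <- predict(fit_full, grid, type = "link", se.fit = TRUE)
band <- data.frame(
  age  = grid$age,
  prob = plogis(p$fit),
  LL   = plogis(p$fit - 1.96 * p$se.fit),
  UL   = plogis(p$fit + 1.96 * p$se.fit)
)

# Empty frame first, then the shaded interval, then the fitted curve on top
plot(band$age, band$prob, type = "n", ylim = c(0, 1),
     xlab = "age", ylab = "predicted probability of gdk")
polygon(c(band$age, rev(band$age)), c(band$LL, rev(band$UL)),
        col = "grey85", border = NA)
lines(band$age, band$prob, lwd = 2)
```

Drawing the polygon before the line keeps the fitted curve visible on top of the shaded interval.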
