带FACET_WRAP的R自举回归 [英] R bootstrap regression with facet_wrap

查看:21
本文介绍了带FACET_WRAP的R自举回归的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一直在练习mtars数据集。

我使用线性模型创建了此图表。

library(tidyverse)
library(tidymodels)

ggplot(data = mtcars, aes(x = wt, y = mpg)) + 
  geom_point() + geom_smooth(method = 'lm')

然后,我将数据帧转换为长数据帧,以便可以尝试使用facet_WRAP。

mtcars_long_numeric <- mtcars %>%
  select(mpg, disp, hp, drat, wt, qsec) 

mtcars_long_numeric <- pivot_longer(mtcars_long_numeric, names_to = 'names', values_to = 'values', 2:6)

现在,我想了解一些有关geom_mooth上的标准错误的信息,并看看是否可以使用引导来生成置信区间。

我在link的RStudio整洁模型文档中找到了此代码。

boots <- bootstraps(mtcars, times = 250, apparent = TRUE)
boots

fit_nls_on_bootstrap <- function(split) {
    lm(mpg ~ wt, analysis(split))
}

boot_models <-
  boots %>% 
  dplyr::mutate(model = map(splits, fit_nls_on_bootstrap),
         coef_info = map(model, tidy))

boot_coefs <- 
  boot_models %>% 
  unnest(coef_info)

percentile_intervals <- int_pctl(boot_models, coef_info)
percentile_intervals

ggplot(boot_coefs, aes(estimate)) +
  geom_histogram(bins = 30) +
  facet_wrap( ~ term, scales = "free") +
  labs(title="", subtitle = "mpg ~ wt - Linear Regression Bootstrap Resampling") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  theme(plot.subtitle = element_text(hjust = 0.5)) +
  labs(caption = "95% Confidence Interval Parameter Estimates, Intercept + Estimate") +
  geom_vline(aes(xintercept = .lower), data = percentile_intervals, col = "blue") +
  geom_vline(aes(xintercept = .upper), data = percentile_intervals, col = "blue")

boot_aug <- 
  boot_models %>% 
  sample_n(50) %>% 
  mutate(augmented = map(model, augment)) %>% 
  unnest(augmented)

ggplot(boot_aug, aes(wt, mpg)) +
  geom_line(aes(y = .fitted, group = id), alpha = .3, col = "blue") +
  geom_point(alpha = 0.005) +
  # ylim(5,25) +
    labs(title="", subtitle = "mpg ~ wt 
 Linear Regression Bootstrap Resampling") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  theme(plot.subtitle = element_text(hjust = 0.5)) +
  labs(caption = "coefficient stability testing") 
https://i.stack.imgur.com/75DQc.png"Rel="无推荐人">https://i.stack.imgur.com/75DQc.png"rrc="mtcars%>;% GROUP_BY(圆柱体)%>;% 汇总(x=分位数(mpg,c(0.25,0.75)),y=iqr(Mpg))%>;% 过滤(cyl==8)%>;% 突变(z=x-y)"/>

有没有某种方法可以将引导回归也作为facet_WRAP呢?我尝试将长数据帧放入bootstraps函数。 。

boots <- bootstraps(mtcars_long_numeric, times = 250, apparent = TRUE)
boots

fit_nls_on_bootstrap <- function(split) {
    group_by(names) %>%
    lm(mpg ~ values, analysis(split))
}

但这不起作用。

或者我尝试在此处添加GROUP_BY:

boot_models <-
  boots %>% 
  group_by(names) %>%
  dplyr::mutate(model = map(splits, fit_nls_on_bootstrap),
         coef_info = map(model, tidy))

这不起作用,因为BOOT$NAMES不存在。我也无法在BOOT_AUG中添加分组作为FEET_WRAP,因为那里不存在名称。

ggplot(boot_aug, aes(values, mpg)) +
  geom_line(aes(y = .fitted, group = id), alpha = .3, col = "blue") +
    facet_wrap(~names) +
  geom_point(alpha = 0.005) +
  # ylim(5,25) +
    labs(title="", subtitle = "mpg ~ wt 
 Linear Regression Bootstrap Resampling") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  theme(plot.subtitle = element_text(hjust = 0.5)) +
  labs(caption = "coefficient stability testing") 

此外,我还了解到我也不能通过~id进行facet_rapp。我最终得到了一个看起来像这样的图表,它非常难以阅读!我真正感兴趣的是对"wt"、"disp"、"qsec"之类的内容使用facet_rapp,而不是在每个引导本身上使用。

这是其中一种情况,我使用的代码稍微超出了我的能力范围-如果您有任何建议,我将不胜感激。

这是我希望获得预期输出的图像。除了标准误差的阴影区域之外,我希望看到或多或少占据相同区域的自举回归模型。

推荐答案

可能是这样的:

library(data.table)
mt = as.data.table(mtcars_long_numeric)

# helper function to return lm coefficients as a list
lm_coeffs = function(x, y) {
  coeffs = as.list(coefficients(lm(y~x)))
  names(coeffs) = c('C', "m")
  coeffs
}

# generate bootstrap samples of slope ('m') and intercept ('C')
nboot = 100
mtboot = lapply (seq_len(nboot), function(i) 
  mt[sample(.N,.N,TRUE), lm_coeffs(values, mpg), by=names])
mtboot = rbindlist(mtboot)

# and plot:    
ggplot(mt, aes(values, mpg)) +
  geom_abline(aes(intercept=C, slope=m), data = mtboot, size=0.3, alpha=0.3, color='forestgreen') +
  stat_smooth(method = "lm", colour = "red", geom = "ribbon", fill = NA, size=0.5, linetype='dashed') +
  geom_point() +
  facet_wrap(~names, scales = 'free_x')

附注:对于那些更喜欢dplyr(不是我)的人,下面是转换为该格式的相同逻辑:

lm_coeffs = function(x, y) {
  coeffs = coefficients(lm(y~x))
  tibble(C = coeffs[1], m=coeffs[2])
}

mtboot = lapply (seq_len(nboot), function(i) 
  mtcars_long_numeric %>%
    group_by(names) %>%
    slice_sample(prop=1, replace=TRUE) %>% 
    summarise(tibble(lm_coeffs2(values, mpg))))
mtboot = do.call(rbind, mtboot)

这篇关于带FACET_WRAP的R自举回归的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆