if-statement in a function using purrr and dplyr (List Column Workflow) in R

Views: 89 Published: 2020/10/26 4:59:01 Tags: r if-statement dplyr purrr rlang

Problem description

I am trying to estimate differences in means between two hospitals, A and B. Each hospital has different "groups"; in the simulated data set I have labelled them group 1 and group 2. That is, I want to test the difference in means between hospital A and hospital B within group 1 and within group 2. In addition, I have more than one variable (e.g. value1 and value2), so I have to test value1 between hospital A and B within groups 1 and 2. Even though I specify method=1 in the call at the end, I get the third method (the else part). I am using the infer package (part of tidymodels) for the bootstrapping.

library(tidyverse)
library(lubridate)
library(readxl)
library(infer)
library(stringr)
library(rlang)

set.seed(1)
A <-data.frame(value1=rnorm(n = 1000, mean = 0.8, sd = 0.2), value2= rnorm(n=10 ,mean=1, sd=0.3)) 
A$hosp <- "A"
A$group <- sample(1:2,nrow(A) , replace=T) 

B= data.frame(value1 = rnorm(n=1200, mean =1 , sd = 0.2), value2= rnorm(n=15, mean=1.1, sd=0.4))
B$hosp <- "B"
B$group <- sample(1:2,nrow(B) , replace=T) 

forskel <- bind_rows(A, B) %>% 
  group_by(group) %>% 
  nest()

rm(A, B)

Below is my function.

bootloop <- function(dataset, procestid, method, reps = 4, alpha = 0.05) { 
  procestid <- enquo(procestid)

  diff_mean <- dataset %>% 
    mutate(diff_means  = map(data, function(.x){.x %>% 
        group_by(hosp) %>% 
        summarise(mean(!!procestid, na.rm=TRUE)) %>% 
        pull() %>% 
        diff() })) %>%
    select(-data)


  bootstrap <- dataset %>% 
    mutate(distribution =map(data, function(.x){ .x %>% 
        specify(as.formula(paste0(quo_name(procestid), "~ hosp")) ) %>% 
        generate(reps = reps, type = "bootstrap") %>%
        calculate(stat = "diff in means", order = c( "A", "B"))} )) %>% 
    inner_join(diff_mean, by="group")  

  if (method==1) {
    bootstrap2 <- bootstrap %>% mutate(Bias_Corrected_KI=map2(distribution, diff_means, function(.x, .y){ .x %>% 
        summarise( l =quantile(.x$stat,pnorm(2*qnorm(sum(.x$stat >= .y)/reps) + qnorm(alpha/2))),
                   u= quantile(.x$stat,pnorm(2*qnorm(sum(.x$stat >= .y)/reps) + qnorm(1-alpha/2)))    )}))  }
  if (method==2) {
    bootstrap2 <- bootstrap %>% mutate(Percentile_KI = map(distribution, function(.x){.x %>% 
        summarize(l = quantile(stat, alpha/2),
                  u = quantile(stat, 1 - alpha/2))}))  }
  else {
    bootstrap2 <- bootstrap %>% mutate(SD_KI =map2(distribution, diff_means, function(.x,.y){.x %>% 
        get_confidence_interval(level = (1 - alpha), type="se", point_estimate = .y)})) 
  }
return(bootstrap2)

}

procestimes <- list("value1", "value2")


a <- map(syms(procestimes), bootloop , dataset=forskel, method=1 ,  reps=1000)
a

Even though I specify method=1 in the call, I get the third form of confidence interval, the one from the else branch.

[[1]]

# A tibble: 2 x 5
  group data                 distribution         diff_means SD_KI           
  <int> <list>               <list>               <list>     <list>          
1     1 <tibble [1,086 x 3]> <tibble [1,000 x 2]> <dbl [1]>  <tibble [1 x 2]>
2     2 <tibble [1,114 x 3]> <tibble [1,000 x 2]> <dbl [1]>  <tibble [1 x 2]>

[[2]]
# A tibble: 2 x 5
  group data                 distribution         diff_means SD_KI           
  <int> <list>               <list>               <list>     <list>          
1     1 <tibble [1,086 x 3]> <tibble [1,000 x 2]> <dbl [1]>  <tibble [1 x 2]>
2     2 <tibble [1,114 x 3]> <tibble [1,000 x 2]> <dbl [1]>  <tibble [1 x 2]>

Recommended answer

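I suppose you forgot to nest the if statements. In your function, if (method==1) is a complete statement on its own, so the else belongs only to if (method==2) and runs whenever method is not 2, including when method is 1, overwriting bootstrap2. Try this: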

bootloop <- function(dataset, procestid, method, reps = 4, alpha = 0.05) { 
  procestid <- enquo(procestid)

  diff_mean <- dataset %>%
    mutate(diff_means  = map(data, function(.x){.x %>%
        group_by(hosp) %>%
        summarise(mean(!!procestid, na.rm=TRUE)) %>%
        pull() %>%
        diff() })) %>%
    select(-data)

  bootstrap <- dataset %>%
    mutate(distribution =map(data, function(.x){ .x %>%
        specify(as.formula(paste0(quo_name(procestid), "~ hosp")) ) %>%
        generate(reps = reps, type = "bootstrap") %>%
        calculate(stat = "diff in means", order = c( "A", "B"))} )) %>%
    inner_join(diff_mean, by="group")

  if (method==1) {
    bootstrap2 <- bootstrap %>% mutate(Bias_Corrected_KI=map2(distribution, diff_means, function(.x, .y){ .x %>% 
        summarise( l =quantile(.x$stat,pnorm(2*qnorm(sum(.x$stat >= .y)/reps) + qnorm(alpha/2))),
                   u= quantile(.x$stat,pnorm(2*qnorm(sum(.x$stat >= .y)/reps) + qnorm(1-alpha/2)))    )}))  }
  else {  # open a curly bracket after this else so the next if/else nests inside it, and close it further down
  if (method==2) {
    bootstrap2 <- bootstrap %>% mutate(Percentile_KI = map(distribution, function(.x){.x %>% 
        summarize(l = quantile(stat, alpha/2),
                  u = quantile(stat, 1 - alpha/2))})) }
  else {
    bootstrap2 <- bootstrap %>% mutate(SD_KI =map2(distribution, diff_means, function(.x,.y){.x %>% 
        get_confidence_interval(level = (1 - alpha), type="se", point_estimate = .y)})) 
  }}
  return(bootstrap2)

}

The results are as follows:

> bootloop (forskel, value1, method=1, reps = 4, alpha = 0.05)
# A tibble: 2 x 5
  group data                 distribution     diff_means Bias_Corrected_KI
  <int> <list>               <list>           <list>     <list>           
1     1 <tibble [1,086 x 3]> <tibble [4 x 2]> <dbl [1]>  <tibble [1 x 2]> 
2     2 <tibble [1,114 x 3]> <tibble [4 x 2]> <dbl [1]>  <tibble [1 x 2]> 
> bootloop (forskel, value1, method=2, reps = 4, alpha = 0.05)
# A tibble: 2 x 5
  group data                 distribution     diff_means Percentile_KI   
  <int> <list>               <list>           <list>     <list>          
1     1 <tibble [1,086 x 3]> <tibble [4 x 2]> <dbl [1]>  <tibble [1 x 2]>
2     2 <tibble [1,114 x 3]> <tibble [4 x 2]> <dbl [1]>  <tibble [1 x 2]>
> bootloop (forskel, value1, method=3, reps = 4, alpha = 0.05)
# A tibble: 2 x 5
  group data                 distribution     diff_means SD_KI           
  <int> <list>               <list>           <list>     <list>          
1     1 <tibble [1,086 x 3]> <tibble [4 x 2]> <dbl [1]>  <tibble [1 x 2]>
2     2 <tibble [1,114 x 3]> <tibble [4 x 2]> <dbl [1]>  <tibble [1 x 2]>
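
The control-flow issue can be reproduced without any of the bootstrap machinery. The following is a minimal sketch in base R; the function names pick_separate and pick_chained and the returned labels are hypothetical stand-ins for the three interval methods, not part of the original code. It also shows that an else if chain is an equivalent, flatter way to write the nested version.

```r
# Hypothetical toy functions: each returns a label instead of computing an interval.

# Mirrors the question: two separate statements, `if` and then `if ... else`.
pick_separate <- function(method) {
  if (method == 1) {
    out <- "bias-corrected"
  }
  # This `if` starts a new statement, so its `else` also runs when method == 1
  # and overwrites the value assigned above.
  if (method == 2) {
    out <- "percentile"
  } else {
    out <- "standard error"
  }
  out
}

# Mirrors the answer: a single chained statement, so only one branch runs.
pick_chained <- function(method) {
  if (method == 1) {
    out <- "bias-corrected"
  } else if (method == 2) {
    out <- "percentile"
  } else {
    out <- "standard error"
  }
  out
}

pick_separate(1)  # "standard error": the first assignment was overwritten
pick_chained(1)   # "bias-corrected"
```

Both functions agree for method 2 and method 3; they differ only for method 1, which is exactly the symptom described in the question.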
