How to ignore cor.test: "not enough finite observations" and continue, when using tidyverse and ggplot2 (ggpmisc)


Question

I have the following working toy example:

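library(tidyverse)  # dplyr, tidyr, purrr, ggplot2
library(broom)      # tidy()
library(ggpmisc)    # stat_poly_eq()

# Keep only 2 rows of virginica so that one group is too small for cor.test()
trunctiris <- iris[1:102, ]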
analysis <- trunctiris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(model = map(data, ~lm(Sepal.Length ~ Sepal.Width, data = .)),
         cor = map(data, ~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3)))

stats <- analysis %>%
  unnest(cor)

ggplot(trunctiris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(shape = 21) +
  geom_text(data = stats, aes(label = sprintf("r = %s", round(estimate, 3)), x = 7, y = 4)) +
  geom_text(data = stats, aes(label = sprintf("p = %s", round(p.value, 3)),  x = 7, y = 3.8)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~")),
               formula = y ~ x,
               parse = TRUE) +
  facet_wrap(~Species)

The code was provided in another question. However, I haven't been able to make it work with my data. The problem is that some (not all) of my groups have fewer than 3 observations, and so, in the "analysis" part, R returns:

Error in mutate_impl(.data, dots) : not enough finite observations

This happens because one group (in this case, virginica) does not have enough observations. To get around it, I've tried 'try(if nrow(data) >= 2)' or similar, like the following:

analysis <- trunctiris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(model = map(data, ~lm(Sepal.Length ~ Sepal.Width, data = .)),
         cor = if_else(nrow(data) <= 2, warning("Must have at least 3 rows of data"),
                       map(data, ~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3))))

which returns:

Error in mutate_impl(.data, dots) : not enough finite observations
In addition: Warning message:
In if_else(nrow(list(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5,  :
  Must have at least 3 rows of data

Does anyone know an easy way to get around this? I'd like to skip the problematic group and keep on going.

Many thanks and sorry for my very basic R skills.

Solution

purrr::safely and purrr::possibly make it easy to guard against errors when you are mapping. In this case, we need to wrap the call to tidy(cor.test(... in possibly and return an empty data.frame if an error occurs:

analysis <- trunctiris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(model = map(data, ~lm(Sepal.Length ~ Sepal.Width, data = .)),
         cor = map(data, possibly(~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3),
                                  otherwise = data.frame())))

# A tibble: 3 × 4
     Species              data    model                  cor
      <fctr>            <list>   <list>               <list>
1     setosa <tibble [50 × 4]> <S3: lm> <data.frame [1 × 8]>
2 versicolor <tibble [50 × 4]> <S3: lm> <data.frame [1 × 8]>
3  virginica  <tibble [2 × 4]> <S3: lm> <data.frame [0 × 0]> #<- Note the empty df here
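
To see what the wrappers do in isolation, here is a minimal sketch on a single two-row data frame. The helper names safe_cor and checked_cor are made up for this illustration, and NA_real_ is used as the fallback only to keep the example short (the answer above uses an empty data.frame instead).

```r
library(purrr)

# Two observations are too few for cor.test(), which needs at least three.
small <- iris[101:102, ]

# possibly() returns `otherwise` instead of raising the error.
# safe_cor is a hypothetical name used only in this sketch.
safe_cor <- possibly(
  function(d) cor.test(d$Sepal.Length, d$Sepal.Width)$estimate[["cor"]],
  otherwise = NA_real_
)
safe_cor(small)         # NA, no error raised
safe_cor(iris[1:50, ])  # the setosa correlation, about 0.743

# safely() keeps both pieces: a list with `result` and `error`.
checked_cor <- safely(function(d) cor.test(d$Sepal.Length, d$Sepal.Width))
out <- checked_cor(small)
is.null(out$result)          # TRUE, because the call failed
conditionMessage(out$error)  # the captured error message
```

possibly is the more convenient of the two when you only want to skip failures; safely is useful when you also want to inspect afterwards why a group failed.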

Which becomes:

unnest(analysis)

# A tibble: 2 × 9
     Species  estimate statistic      p.value parameter  conf.low conf.high
      <fctr>     <dbl>     <dbl>        <dbl>     <int>     <dbl>     <dbl>
1     setosa 0.7425467  7.680738 6.709843e-10        48 0.5851391 0.8460314
2 versicolor 0.5259107  4.283887 8.771860e-05        48 0.2900175 0.7015599
# ... with 2 more variables: method <fctr>, alternative <fctr>

And so the group that gave an error is successfully removed from the end result.
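
If you would rather check the group size explicitly, as the if_else attempt in the question tried to do, the check has to run once per group inside map. Inside mutate, data is the whole list column rather than one group's data frame, and if_else evaluates both branches, so cor.test is still called on the small group. The following is only a sketch: cor_if_enough and min_n are names invented here, and it assumes tidyr 1.0 or later, where unnest drops empty elements by default.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

trunctiris <- iris[1:102, ]

# Hypothetical helper: run the test only when enough complete pairs exist,
# otherwise return an empty data.frame, as in the possibly() version.
cor_if_enough <- function(d, min_n = 3) {
  n_ok <- sum(complete.cases(d$Sepal.Length, d$Sepal.Width))
  if (n_ok < min_n) return(data.frame())
  tidy(cor.test(d$Sepal.Length, d$Sepal.Width))
}

analysis <- trunctiris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(cor = map(data, cor_if_enough))

# Groups with an empty `cor` element are dropped when unnesting.
stats <- analysis %>%
  select(Species, cor) %>%
  unnest(cor)
```

Unlike possibly, this only skips groups for the one reason you anticipated; any other error in cor.test would still stop the pipeline, which may or may not be what you want.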
