`non-finite value supplied` error in ggstatsplot


Question

I am working with ggstatsplot to get visual representations of my statistical analyses.

I have numerous datasets, all very similar in make-up. Some work just fine, while others don't. data1 is a working example; data2 fails.

data1 <- structure(list(
     treatment = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
     2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
     3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 
     5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
     6L),
     .Label = c("negative_ctrl", "positive_ctrl", "treatmentA", "treatmentB", "treatmentC", "treatmentD"), class = "factor"),
     
     value = c(1.74501, 2.04001, 1.89501, 1.84001, 
     1.89501, 9.75001, 8.50001, 8.80001, 11.50001, 10.25001, 7.90001, 
     9.25001, 11.45001, 7.75001, 7.75001, 7.55001, 8.70001, 8.20001, 
     6.95001, 6.60001, 7.40001, 7.15001, 8.25001, 9.20001, 8.95001, 
     6.45001, 6.05001, 5.40001, 7.95001, 6.80001, 4.65001, 6.40001, 
     6.40001, 6.70001, 5.40001, 3.20001, 2.70001, 4.30001, 4.10001, 
     3.60001, 4.00001, 3.00001, 4.70001, 3.10001, 3.50001, 6.45001, 
     5.45001, 4.90001, 7.25001, 4.55001, 4.70001, 6.25001, 5.65001, 
     6.00001, 5.10001)),
     
     row.names = c(NA, -55L), class = c("tbl_df", "tbl", "data.frame"))

data2 <- structure(list(
     treatment = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
     2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
     4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 
     5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L),
     .Label = c("negative_ctrl", "positive_ctrl", "treatmentA", "treatmentB", "treatmentC", "treatmentD"), class = "factor"), 
    
     value = c(1.00001, 1.00001, 1.00001, 1.00001, 1.00001, 6.77501, 
     5.68751, 5.99201, 8.24501, 7.01251, 4.79501, 5.99126, 8.26276, 
     5.35376, 5.38751, 4.60251, 5.38901, 4.85201, 4.44401, 5.20501, 
     6.20701, 5.77001, 4.05201, 3.65126, 3.02401, 4.68351, 3.90001, 
     2.56951, 3.70001, 3.61901, 3.96401, 2.93601, 1.53901, 1.40801, 
     2.05601, 2.08501, 1.89701, 1.79501, 1.50001, 2.09151, 1.53551, 
     1.57501, 3.88851, 3.09151, 2.75501, 4.40626, 2.42001, 2.60951, 
     3.83501, 3.37151, 3.70001, 2.92701)),
     
     row.names = c(NA, -52L), class = c("tbl_df", "tbl", "data.frame"))

I run the most basic analysis on both datasets:

library(Rmpfr)
library(ggstatsplot)

ggstatsplot::ggbetweenstats(
     data = data1, 
     x = treatment, 
     y = value,
     messages = FALSE )

ggstatsplot::ggbetweenstats(
     data = data2, 
     x = treatment, 
     y = value,
     messages = FALSE )

For data1 the plot is produced as expected. For data2 I get:

> Error in stats::optim(par = 1.1 * rep(lambda, 2), fn = function(x) { : non-finite value supplied by optim

At first I thought the issue might be a few zeros in the negative control, so I increased them first by a tiny amount and then by 1 to make sure the range of the values was not the problem. The only discrepancy I can see is that treatmentA (level 3) has only 7 measurements in data2 versus 10 in data1 (I had to remove a few NAs due to sample failure). However, in both cases the negative control (level 1) has only 5 values, and I don't think unequal sample sizes between the groups should be a problem in this type of analysis.

Answer

It's a good idea to try basic plots in these cases, e.g. isolate the boxplots and compare the two datasets:

boxplot(value ~ treatment, data=data1)
boxplot(value ~ treatment, data=data2)

data2 has a treatment with no variability ("negative_ctrl"): its SD is 0. I'm guessing this function runs some tests that require variation. You will need to read the function's documentation to see whether this is mentioned, but you can still get a plot either by removing that treatment or by forcing a very small amount of variation, e.g.:

# run without negative_ctrl
ggstatsplot::ggbetweenstats(
  data = data2[data2$treatment != "negative_ctrl",], 
  x = treatment, 
  y = value,
  messages = FALSE )

# add some tiny fake variation to force it through (this is a hack)
data3 <- data2
# change only the first negative_ctrl value
data3$value[which(data3$treatment == "negative_ctrl")[1]] <- 1.0001
ggstatsplot::ggbetweenstats(
  data = data3, 
  x = treatment, 
  y = value,
  messages = FALSE )
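
To confirm the zero-variance diagnosis numerically rather than by eye, and to catch the same problem in other datasets before plotting, something like the following base R sketch may help. The toy data frame, the names `group_sd`, `zero_var_groups` and `clean`, and the `1e-8` tolerance are assumptions made for illustration; they are not part of ggstatsplot.

```r
# Toy data shaped like data2 (hypothetical values):
# "negative_ctrl" is constant, the other groups vary.
toy <- data.frame(
  treatment = factor(rep(c("negative_ctrl", "positive_ctrl", "treatmentA"),
                         each = 4)),
  value = c(1, 1, 1, 1,
            6.8, 5.7, 6.0, 8.2,
            4.6, 5.4, 4.9, 4.4)
)

# Standard deviation of value within each treatment group
group_sd <- tapply(toy$value, toy$treatment, sd)

# Groups whose spread is (numerically) zero; the tolerance is an assumption
zero_var_groups <- names(group_sd)[group_sd < 1e-8]
print(zero_var_groups)

# Drop those groups, and their now-unused factor levels, before plotting
clean <- droplevels(toy[!toy$treatment %in% zero_var_groups, ])
print(levels(clean$treatment))
```

Running the same check on data2 (with `data2` in place of `toy`) should flag "negative_ctrl", and `clean` can then be passed as the `data` argument in the first workaround above.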
