Dataset limitation in R package "nparcomp"


Question

I have recently been using the R package nparcomp to test for significant differences in my response variable between categories.

I found that the nparcomp function cannot handle large datasets (more than 5,000 rows). For example, here is my code:

a<-nparcomp(oc20_kgm2~ decade, data=dat, asy.method = "mult.t",
            type = "Tukey",alternative = "two.sided", 
            plot.simci = TRUE, info = FALSE)

summary(a)

Here, oc20_kgm2 is my response variable, decade is my factor (with 10 categories), and dat is my dataset. My original dataset has about 15,000 rows (samples). When I run the code above, I get this error:

Error in checkmvArgs(lower = lower, upper = upper, mean = delta, corr = corr,  : 
  ‘lower’ not specified or contains NA
In addition: There were 49 warnings (use warnings() to see them)

To diagnose this, I randomly selected 5,000 samples from my original dat and ran the same code, and it worked. However, subsets of 5,500 or 10,000 samples do not work.

My question is: is there a limit on the sample size for this function? And is there any other test function or package in R that I could use?

Revision after reading the comment:

> traceback()

4: stop(sQuote("lower"), " not specified or contains NA")
3: checkmvArgs(lower = lower, upper = upper, mean = delta, corr = corr, 
       sigma = sigma)
2: pmvt(lower = -abs(T[pp]), abs(T[pp]), corr = rho.bf, df = df.sw, 
       delta = rep(0, nc))
1: nparcomp(oc20_kgm2 ~ decade, data = dat2, asy.method = "mult.t", 
       type = "Tukey", alternative = "two.sided", plot.simci = TRUE, 
       info = FALSE)

> warnings()
Warning messages:
1: In n[j] * n[w] * n[i] : NAs produced by integer overflow
2: In n[i] * n[w] * n[j] : NAs produced by integer overflow
3: In n[i] * n[v] * n[j] : NAs produced by integer overflow
4: In cov2cor(cov.bf) :
  diag(.) had 0 or NA entries; non-finite result is doubtful


Recommended answer

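This error occurs because n, the vector of group sizes (one entry per factor level), is an integer vector. Products such as n[i] * n[w] * n[j] therefore overflow once they exceed .Machine$integer.max (2147483647) and become NA, as the warnings show. Those NAs propagate through the covariance estimate (hence the cov2cor warning) into the call to pmvt, which then rejects 'lower'. To fix it, modify the source code of nparcomp from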

n <- sapply(samples, length)

to

n <- as.numeric(sapply(samples, length))
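
The overflow can be reproduced in isolation, without nparcomp. The sketch below uses hypothetical group sizes (three groups of 1,500 observations, chosen only for illustration and not taken from the question's data) to show that the integer product becomes NA while the same product on doubles does not.

```r
# Hypothetical group sizes: three groups of 1,500 observations each.
samples <- list(numeric(1500), numeric(1500), numeric(1500))

n_int <- sapply(samples, length)              # integer vector
n_num <- as.numeric(sapply(samples, length))  # double vector

# Integer arithmetic overflows past .Machine$integer.max and yields NA
# (R also emits a warning, suppressed here to keep the output clean).
prod_int <- suppressWarnings(n_int[1] * n_int[2] * n_int[3])

# Double arithmetic represents the same product without overflow.
prod_num <- n_num[1] * n_num[2] * n_num[3]

print(is.integer(n_int))
print(prod_int)
print(prod_num)
print(prod_num > .Machine$integer.max)
```

Whether a given dataset triggers the overflow depends on its actual group sizes, so there is no single row-count limit; it is the products of group sizes that matter.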

To view the source code, type nparcomp at an R prompt.
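
One way to apply such a change without touching the installed package is to patch a copy of the function. The sketch below shows the idea on a stand-in function, toy_fun (a hypothetical name, not part of nparcomp), so that it runs on its own. Whether the installed version of nparcomp still contains the exact line being replaced is an assumption you would need to check against the printed source.

```r
# Stand-in for a package function that multiplies group sizes.
# toy_fun is hypothetical; it only mimics the problematic line.
toy_fun <- function(samples) {
  n <- sapply(samples, length)
  n[1] * n[2] * n[3]
}

# Deparse the function to text, replace the line, and rebuild the function.
src <- deparse(toy_fun)
src <- sub("n <- sapply(samples, length)",
           "n <- as.numeric(sapply(samples, length))",
           src, fixed = TRUE)
patched_fun <- eval(parse(text = src))

# For a real package function, the patched copy would also need the package
# namespace as its environment so that internal helpers are still found,
# e.g. environment(patched_fun) <- asNamespace("nparcomp").

# Hypothetical group sizes for illustration.
samples <- list(numeric(1500), numeric(1500), numeric(1500))
original_result <- suppressWarnings(toy_fun(samples))  # integer overflow
patched_result <- patched_fun(samples)                 # computed on doubles

print(original_result)
print(patched_result)
```

The patched copy lives only in the current session, so it has to be recreated in each new session or kept in a script.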
