Decimal points - Probability value of 0 in Language R

Question

How should I deal with p-values in R?

I am expecting very low p-values, such as:

1.00E-80

I need their -log10:

-log10(1.00E-80)

-log10(0) is Inf, and I also get Inf when a tiny but nonzero value has been rounded to 0.

It seems that below about 1.00E-308, R yields 0:

1/10^308  
[1] 1e-308

 1/10^309 
[1] 0

Is the precision of the p-values reported by the lm function limited by this same cutoff, 1e-308? Or is it simply designed so that some cutoff is needed, in which case I should choose a different one, such as 1e-100, and replace 0 with <1e-100?

Recommended answer

There are a variety of possible answers; which one is most useful depends on the context:


  • Under ordinary circumstances R is indeed incapable of storing floating-point values closer to zero than .Machine$double.xmin, which varies by platform but is typically (as you discovered) on the order of 1e-308. If you really need to work with numbers this small and can't find a way to work on the log scale directly, you need to search Stack Overflow or the R wiki for methods for dealing with arbitrary/extended precision values (but you should probably try to work on the log scale; it will be much less of a hassle).

  • In many circumstances R actually computes p-values on the (natural) log scale internally, and can, if requested, return the log values rather than exponentiating them before giving the answer. For example, dnorm(-100,log=TRUE) gives -5000.919. You can convert directly to the log10 scale (without exponentiating and then using log10) by dividing by log(10): dnorm(-100,log=TRUE)/log(10) is approximately -2171.9, i.e. a value of about 10^-2171.9, which would be too small to represent in floating point. For the p*** (cumulative distribution function) functions, use log.p=TRUE rather than log=TRUE. (This point depends heavily on your particular context. Even if you are not using built-in R functions, you may be able to find a way to extract results on the log scale.)

  • In some cases R presents p-value results as being <2.2e-16 even when a more precise value is known. For example, (t1 <- t.test(rnorm(10,100),rnorm(10,80)))

prints

....
t = 56.2902, df = 17.904, p-value < 2.2e-16

but you can still extract the precise p-value from the result:

> t1$p.value
[1] 1.856174e-18

(In many cases this behaviour is controlled by the format.pval() function.)
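The points above can be checked directly. The following is a minimal sketch; the z-score of -40 and the variable names are illustrative assumptions, not part of the original answer, and the underflow to zero assumes ordinary IEEE-754 double precision.

```r
# Smallest normalized positive double on this platform (typically about 2.2e-308)
xmin <- .Machine$double.xmin

# A tail probability that underflows to 0 on the natural scale ...
p <- pnorm(-40)

# ... but is still available on the log scale
log_p   <- pnorm(-40, log.p = TRUE)  # natural log of the tail probability
log10_p <- log_p / log(10)           # convert to log10 without exponentiating

# format.pval() is the function that produces the "<..." style display
# for values below its eps argument (default: .Machine$double.eps)
shown <- format.pval(1.856174e-18)

print(c(xmin = xmin, p = p, log10_p = log10_p))
print(shown)
```

Working with log10_p instead of p keeps the magnitude of the result even when p itself cannot be stored as a double.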

How this applies to lm:

d <- data.frame(x=rep(1:5,each=10))
set.seed(101)
d$y <- rnorm(50,mean=d$x,sd=0.0001)
lm1 <- lm(y~x,data=d)

summary(lm1) prints the p-value of the slope as <2.2e-16, but if we use coef(summary(lm1)) (which does not apply the p-value formatting), we can see that the value is 9.690173e-203.

A more extreme case:

set.seed(101); d$y <- rnorm(50,mean=d$x,sd=1e-7)
lm2 <- lm(y~x,data=d)
coef(summary(lm2))

shows that the p-value has actually underflowed to zero. However, we can still get an answer on the log scale:

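tval <- coef(summary(lm2))["x","t value"]
# two-sided p = 2 * (upper tail), so add log(2) on the log scale, then convert to log10
(log(2) + pt(abs(tval),df=48,lower.tail=FALSE,log.p=TRUE))/log(10)

gives about -346.0, i.e. a p-value of roughly 10^-346. (Note that the factor of 2 for the two-sided test must be added as log(2) on the log scale; multiplying the log value by 2 would instead square the p-value and give about -692.6.) You can check this approach with the previous example, where the p-value doesn't underflow, and see that you get the log10 of the value reported by coef(summary(lm1)).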
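The check suggested above can be written out as a small sketch. The helper name log10_pvalues is hypothetical (it is not part of R or of the original answer), and it uses df.residual() rather than a hard-coded 48 degrees of freedom. The factor of 2 for the two-sided test becomes an additive log(2) on the log scale.

```r
# Hypothetical helper: log10 of the two-sided p-value for every coefficient
# of an lm fit, computed entirely on the log scale so it cannot underflow.
log10_pvalues <- function(fit) {
  tvals <- coef(summary(fit))[, "t value"]
  log_p <- log(2) + pt(abs(tvals), df = df.residual(fit),
                       lower.tail = FALSE, log.p = TRUE)
  log_p / log(10)
}

# Same data as the first lm example above
d <- data.frame(x = rep(1:5, each = 10))
set.seed(101)
d$y <- rnorm(50, mean = d$x, sd = 0.0001)
lm1 <- lm(y ~ x, data = d)

via_log <- unname(log10_pvalues(lm1)["x"])               # log-scale route
direct  <- log10(coef(summary(lm1))["x", "Pr(>|t|)"])    # works only while p has not underflowed
print(c(via_log = via_log, direct = direct))
```

The two routes should agree whenever the p-value is still representable; only the log-scale route remains usable once the p-value underflows to zero.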
