How to compute p-values from z-scores in R when the z-score is large (p-value very close to zero)?

Question

In genetics, very small p-values are common (for example 10^-400). I am looking for a way to get a very small two-tailed p-value in R when the z-score is large, for example:

z <- 40
pvalue <- 2 * pnorm(abs(z), lower.tail = FALSE)

This returns exactly zero rather than the very small (and highly significant) value I need.

Recommended answer

The inability to handle p-values smaller than about 10^(-308) (.Machine$double.xmin) is not really R's fault. It is a generic limitation of any computational system that stores numbers as double-precision (64-bit) floats.
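
To see the limit concretely, here is a minimal sketch in base R. It uses only `pnorm` and `.Machine$double.xmin`; the variable names `p_direct` and `log_p` are illustrative and not part of the original answer.

```r
# Smallest positive normalized double; values far below this underflow to 0
.Machine$double.xmin

z <- 40

# Computed directly, the two-tailed p-value underflows to exactly zero
p_direct <- 2 * pnorm(abs(z), lower.tail = FALSE)
p_direct == 0

# On the log scale the same quantity is an ordinary finite number
log_p <- log(2) + pnorm(abs(z), lower.tail = FALSE, log.p = TRUE)
is.finite(log_p)
log_p / log(10)   # log10 of the p-value, roughly -349.1
```

The p-value itself cannot be held in a double, but its logarithm is a modest negative number, which is why working on the log scale sidesteps the underflow.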

It is not hard to solve the problem by computing on the log scale, but you cannot store the result as an ordinary numeric value in R. Instead, you need to store (or print) it as a mantissa plus an exponent.

pvalue.extreme <- function(z) {
   ## two-tailed p-value on the natural-log scale: log(2 * P(Z > |z|))
   log.pvalue <- log(2) + pnorm(abs(z), lower.tail = FALSE, log.p = TRUE)
   log10.pvalue <- log.pvalue/log(10) ## from natural log to log10
   mantissa <- 10^(log10.pvalue %% 1) ## fractional part: mantissa in [1, 10)
   exponent <- log10.pvalue %/% 1     ## integer part (floor): power of 10
   ## or return(c(mantissa,exponent))
   return(sprintf("p value is %1.2f times 10^(%d)",mantissa,exponent))
}

Test with a not-too-extreme case:

pvalue.extreme(5)
## [1] "p value is 5.73 times 10^(-7)"
2*pnorm(5,lower.tail=FALSE)
## [1] 5.733031e-07

A more extreme case:

pvalue.extreme(40)
## [1] "p value is 7.31 times 10^(-350)"
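
If the mantissa and exponent are needed as numbers rather than as a formatted string (the alternative hinted at in the function's comment), a vectorized variant might look like the sketch below. The name `pvalue_parts` and the data-frame layout are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: return mantissa and exponent as numeric columns
# for a whole vector of z-scores (two-tailed p-values).
pvalue_parts <- function(z) {
  # log10 of the two-tailed p-value, computed entirely on the log scale
  log10_p <- (log(2) + pnorm(abs(z), lower.tail = FALSE, log.p = TRUE)) / log(10)
  exponent <- floor(log10_p)            # power of 10
  mantissa <- 10^(log10_p - exponent)   # value in [1, 10)
  data.frame(z = z, mantissa = mantissa, exponent = exponent)
}

res <- pvalue_parts(c(5, 40))
res
```

Keeping the two parts in separate columns makes it possible to sort or filter by exponent, which is usually all that matters at this scale.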

R also has several packages that handle extremely large or small numbers with extended precision (Brobdingnag, Rmpfr, ...). For example,

2*Rmpfr::pnorm(Rmpfr::mpfr(40, precBits=100), lower.tail=FALSE, log.p = FALSE)
## 1 'mpfr' number of precision  100   bits 
## [1] 7.3117870818300594074979715966414e-350

However, working with an arbitrary-precision system carries a substantial cost in both computational efficiency and convenience.
