Getting high precision values from qnorm in the tail
Question
I am looking for high-precision values of the normal distribution in the tails (at 1e-10 and 1 - 1e-10), because the R package I am using clamps any number outside this range to these values and then calls the qnorm and qt functions.
What I have noticed is that the qnorm implementation in R is not symmetric in the tails. This surprised me, since the distribution is well known to be symmetric, and I have seen implementations in other languages that are symmetric. I checked the qt function and it is not symmetric in the tails either.
Here are the results from the qnorm function:
x       qnorm(x)               qnorm(1-x)            qnorm(1-x) + qnorm(x)
1e-2    -2.3263478740408408    2.3263478740408408    0.0 (i.e. < machine epsilon)
1e-3    -3.0902323061678132    3.0902323061678132    0.0 (i.e. < machine epsilon)
1e-4    -3.71901648545568      3.7190164854557084    2.8421709430404007e-14
1e-5    -4.2648907939228256    4.2648907939238399    1.014299755297543e-12
1e-10   -6.3613409024040557    6.3613408896974208    -1.2706634855419452e-08
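The table can be reproduced with a few lines of R. This is only a minimal sketch; the variable and column names are illustrative and not taken from the original post.

```r
# Tail probabilities used in the table
x <- c(1e-2, 1e-3, 1e-4, 1e-5, 1e-10)

res <- data.frame(
  x     = x,
  lower = qnorm(x),                 # quantile for lower-tail probability x
  upper = qnorm(1 - x),             # quantile for lower-tail probability 1 - x
  sum   = qnorm(1 - x) + qnorm(x)   # would be 0 under exact symmetry
)

# Print with enough digits to make the asymmetry visible
print(res, digits = 17)
```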
It is quite clear that for values of x close to 0 or 1, this function breaks down. Yes, in "normal" use this isn't a problem, but I am looking at fringe cases and multiplying small probabilities by very large values, in which case the error (about 1e-08) becomes large.
Note: I have tried this both with 1-x and by entering the actual numbers 0.00001 and 0.99999, and the accuracy issue is still there.
Firstly, is this a known problem with the qnorm and qt implementations? I could not find anything in the documentation; the algorithm is supposed to be accurate to 16 digits for p values from 10^-314, as described in the Algorithm AS 241 paper.
Quote from the R documentation:
Wichura, M. J. (1988) Algorithm AS 241: The percentage points of the normal distribution. Applied Statistics, 37, 477–484.
which provides precise results up to about 16 digits.
If the R code implements the 7 digit version, why does it claim 16 digits? Or is it "accurate", but the original algorithm is itself asymmetric and wrong?
If R does implement both versions of Algorithm AS 241, can I turn the 16 digit version on?
Or is there a more accurate version of qnorm in R? Or is there another solution to my problem, given that I need high precision in the tails of quantile functions?
> version
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          3.2
year           2016
month          10
day            31
svn rev        71607
language       R
version.string R version 3.3.2 (2016-10-31)
nickname       Sincere Pumpkin Patch
Recommended answer
It turns out (as noted by Spencer Graves in his response to this same question on the R-devel list-serve) that qnorm() does in fact perform as advertised. It's just that, to get highly accurate results in the distribution's upper tail, you'll need to avail yourself of the function's lower.tail argument.
Here's how:
options(digits=22)
## For values of p in [0, 0.5], specify lower tail probabilities
qnorm(p = 1e-10) ## x: P(X <= x) == 1e-10
# [1] -6.3613409024040557
## For values of p in (0.5, 1], specify upper tail probabilities
qnorm(p = 1e-10, lower.tail=FALSE) ## x: P(X > x) == 1e-10 (correct approach)
# [1] 6.3613409024040557
qnorm(p = 1 - 1e-10) ## x: P(X <= x) == 1-(1e-10) (incorrect approach)
# [1] 6.3613408896974208
The problem is that 1-1e-10 (for example) is subject to floating point rounding error, so it isn't really the same distance from 1 (the upper end of the interval) as 1e-10 is from 0 (the lower end of the interval). The underlying problem (it's R-FAQ 7.31!) becomes obvious when put in a more familiar guise:
1 - (1 - 1e-10) == 1e-10
## [1] FALSE
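To see that this tiny rounding error is enough to explain the discrepancy in the question, here is a rough first-order check. It is a sketch rather than a rigorous error analysis: it assumes the shift in the quantile is approximately the error in the tail probability divided by the normal density at that point, and all variable names are illustrative.

```r
p <- 1e-10

# Upper-tail probability that the double 1 - p actually represents
stored_tail <- 1 - (1 - p)
delta_p     <- stored_tail - p          # rounding error introduced by forming 1 - p

# Accurate upper-tail quantile
x <- qnorm(p, lower.tail = FALSE)

# First-order estimate: the upper-tail probability has slope -dnorm(x) in x,
# so an error delta_p in the probability moves the quantile by about
# -delta_p / dnorm(x)
predicted_shift <- -delta_p / dnorm(x)

# Shift actually seen when the quantile is computed from 1 - p
observed_shift <- qnorm(1 - p) - x

c(delta_p = delta_p, predicted = predicted_shift, observed = observed_shift)
```

If the two shifts agree closely, the asymmetry comes entirely from the input 1 - p and not from qnorm() itself. The density in the tail is so small that even an error far below machine epsilon in the probability is magnified into the roughly 1e-08 difference reported in the question.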
Finally, here's a quick confirmation that qnorm() provides accurate (or at least symmetrical) results out to the values claimed in its help file:
qnorm(1e-314)
## [1] -37.906647423565666
qnorm(1e-314, lower.tail=FALSE)
## [1] 37.906647423565666
## With this failing in just the way (and for just the reason) you'd now expect
qnorm(1-1e-314)
# [1] Inf
1 == (1-1e-314)
# [1] TRUE
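The question also mentions qt(). It accepts the same lower.tail argument, so the same approach should carry over. A minimal sketch, with df = 5 chosen arbitrarily for illustration (the question does not say which degrees of freedom are in use):

```r
p  <- 1e-10
df <- 5   # arbitrary value, for illustration only

lo       <- qt(p, df = df)                      # lower-tail quantile
hi       <- qt(p, df = df, lower.tail = FALSE)  # upper tail, specified directly
hi_naive <- qt(1 - p, df = df)                  # upper tail via 1 - p

print(c(lower = lo, upper = hi, naive = hi_naive), digits = 17)

lo + hi         # expected to be zero, or at most a rounding error
lo + hi_naive   # carries the rounding error of 1 - p
```

The t distribution has much heavier tails than the normal, so the quantile is larger and the effect of the rounding error in 1 - p can be correspondingly larger in absolute terms.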