Normalized Double Vector Not Unit Length To Machine Precision


Problem Description


    I have a Java application that uses high-dimensional vectors formed from double's. It normalizes these vectors by multiplying the vector components by the reciprocal of the Euclidean norm. Sometimes, the resulting vector has a norm that is not equal to 1 to machine-precision. That this occurs does not surprise me.

    My question is: how do I normalize the vector such that the resulting vector has unit length to machine precision?

    These are the methods for my Vector class to compute the norm and normalize the vector:

    public double getFrobeniusNorm() {
        return Math.sqrt(getFrobeniusNormSquared());
    }
    
    public double getFrobeniusNormSquared() {
        double normSquared = 0.0;
        int numberOfRows = getRowDimension();
        int numberOfColumns = getColumnDimension();
        for(int i = 0; i < numberOfRows; ++i) {
            for(int j = 0; j < numberOfColumns; ++j) {
                double matrixElement = get(i,j);
                normSquared += matrixElement*matrixElement;
            }
        }
        return normSquared;
    }
    
    public void normalize() {
        double norm = getFrobeniusNorm();
        if (norm == 0) {
            throw new ArithmeticException("Cannot get a unit vector from the zero vector.");            
        } else {
            double oneOverNorm = 1.0 / norm;
            multiplyEquals(oneOverNorm);
        }
    }
    

    Since this is Java, I can't use techniques specific to the operating system and processor, but otherwise this seems like a standard floating-point algorithm issue.

    I can improve the norm calculation using Kahan summation and/or dividing out the largest component, but the consistency between normalizing and calculating the norm is the real issue. The norm is more important than the direction, so I see this as finding the floating point vector closest in direction to the original vector with the constraint that the norm is 1 to machine precision. For my purposes, the original vector is exact.
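A minimal sketch of the Kahan-summation idea mentioned above, applied to the sum of squares. The class and method names here are illustrative and not part of the original Vector class:

```java
// Sketch: Kahan-compensated sum of squares, as an alternative to the
// naive accumulation in getFrobeniusNormSquared(). Illustrative names only.
public final class KahanNorm {
    // Returns the sum of v[i]*v[i], carrying a compensation term that
    // captures the low-order bits lost in each addition.
    static double sumOfSquaresKahan(double[] v) {
        double sum = 0.0;
        double c = 0.0; // running compensation for lost low-order bits
        for (double x : v) {
            double y = x * x - c;
            double t = sum + y;
            c = (t - sum) - y; // (t - sum) recovers the rounded part of y
            sum = t;
        }
        return sum;
    }

    static double norm(double[] v) {
        return Math.sqrt(sumOfSquaresKahan(v));
    }

    public static void main(String[] args) {
        double[] v = {3.0, 4.0};
        System.out.println(norm(v)); // 5.0 (exact for this small case)
    }
}
```

Note that squaring each component before summing can still overflow or underflow for extreme magnitudes; dividing out the largest component first, as mentioned above, guards against that.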

    Suppose the original vector is u. I call u.normalize(). Then, if I compute Math.abs(u.getFrobeniusNorm() - 1d), in some cases the result is hundreds of ulps. This is the problem. I can accept that the vector norm has error; I just want to normalize the vector such that the norm as calculated by u.getFrobeniusNorm() is 1 to the smallest possible number of ulps. Improving u.getFrobeniusNorm() makes sense, but I don't think that solves the consistency issue.
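The "hundreds of ulps" deviation can be quantified with Math.ulp, which returns the spacing of doubles around a given value. A small hypothetical sketch (class and method names are my own):

```java
// Sketch: express how far a computed norm is from 1.0, measured in
// ulps at 1.0. Purely illustrative; not part of the original Vector class.
public final class UlpError {
    static double ulpsFromOne(double computedNorm) {
        // Math.ulp(1.0) is the gap between 1.0 and the next double up.
        return Math.abs(computedNorm - 1.0) / Math.ulp(1.0);
    }

    public static void main(String[] args) {
        // A norm that is off by a few hundred ulps, as described above.
        double norm = 1.0 + 300 * Math.ulp(1.0);
        System.out.println(ulpsFromOne(norm)); // prints 300.0
    }
}
```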

    Solution

    While the original question remains interesting, I recently found the source of the problem in my code. In another piece of code, I was improving a summation by implementing a variation of Kahan summation. I revisited the unit vector code and found that the normalization was not the problem.

    The normalization method involves three steps:

    1. Calculate the vector norm of x. Call this ‖x‖.
    2. Multiply each component of x by 1/‖x‖. The resulting vector is u.
    3. Calculate the vector norm of u. Call this ‖u‖. ‖u‖ should equal 1 to some tolerance.
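The three steps can be sketched on a plain double[] as a minimal stand-in for the Vector class above (not the author's actual code):

```java
// Sketch of the normalize-and-check procedure described above,
// using a plain double[] and naive summation. Names are illustrative.
public final class NormalizeCheck {
    static double norm(double[] v) {
        double s = 0.0;
        for (double x : v) s += x * x;
        return Math.sqrt(s);
    }

    // Steps 1-2: compute ||x|| and scale each component by its reciprocal.
    static void normalize(double[] v) {
        double n = norm(v);
        if (n == 0) throw new ArithmeticException("Cannot normalize the zero vector.");
        double inv = 1.0 / n;
        for (int i = 0; i < v.length; i++) v[i] *= inv;
    }

    public static void main(String[] args) {
        double[] u = {1.0, 2.0, 2.0}; // ||u|| = 3 exactly
        normalize(u);
        // Step 3: check how close ||u|| is to 1 after normalization.
        System.out.println(norm(u));
    }
}
```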

    To improve the normalization method, I calculated the norm with the improved summation, scaled the components by the reciprocal of the more accurate norm, and calculated the norm of the unit vector using the improved summation to check how well it was normalized. Sure enough, the unit vector was normalized to a much lower tolerance, roughly machine precision * dimension. I compared the improved normalization method to the previous method, and it was better. What surprised me was that the old normalization method was just as accurate if the second vector norm calculation used the improved summation.

    So it was not the normalization itself that caused the problem, but rather the check of the normalization. It appears that naive summation is less accurate (even in a relative sense) for sums near 1 than for many other values. I say "many other values" because the original problem occurred for vectors of all magnitudes in practice, but I suspect that some vectors, and therefore some sums, have the same bad behavior as unit vectors (with sums near 1). However, the problem norm values are probably sparsely distributed over the real numbers, like powers of 2.

    In the original method, the problem was that the two vector norm calculations had different relative accuracies. If you start with a vector with a norm near one, the two calculations would have nearly identical relative accuracies, and the normalization itself would be inaccurate.

    So now, I don't calculate the vector norm of the unit vector as a check.

