Covariance matrix from np.polyfit() has negative diagonal?


Problem description

Problem: the cov=True option of np.polyfit() produces a covariance matrix whose diagonal contains nonsensical negative values.

UPDATE: after playing with this some more, I am really starting to suspect a bug in numpy. Is that possible? Deleting any pair of 13 values from the dataset will fix the problem.

I am using np.polyfit() to calculate the slope and intercept coefficients of a dataset. A plot of the values is very nearly (but not perfectly) linear. I am attempting to get the standard deviations of these coefficients with np.sqrt(np.diag(cov)); however, this throws an error because the diagonal contains negative values.

Mathematically, a covariance matrix cannot have negative entries on its diagonal. Is numpy doing something wrong?

Here is a snippet that reproduces the problem:

import numpy as np

x = [1476728821.797, 1476728821.904, 1476728821.911, 1476728821.920, 1476728822.031, 1476728822.039,
     1476728822.047, 1476728822.153, 1476728822.162, 1476728822.171, 1476728822.280, 1476728822.289,
     1476728822.297, 1476728822.407, 1476728822.416, 1476728822.423, 1476728822.530, 1476728822.539,
     1476728822.547, 1476728822.657, 1476728822.666, 1476728822.674, 1476728822.759, 1476728822.788,
     1476728822.797, 1476728822.805, 1476728822.915, 1476728822.923, 1476728822.931, 1476728823.038,
     1476728823.047, 1476728823.054, 1476728823.165, 1476728823.175, 1476728823.182, 1476728823.292,
     1476728823.300, 1476728823.308, 1476728823.415, 1476728823.424, 1476728823.432, 1476728823.551,
     1476728823.559, 1476728823.567, 1476728823.678, 1476728823.689, 1476728823.697, 1476728823.808,
     1476728823.828, 1476728823.837, 1476728823.947, 1476728823.956, 1476728823.964, 1476728824.074,
     1476728824.083, 1476728824.091, 1476728824.201, 1476728824.209, 1476728824.217, 1476728824.324,
     1476728824.333, 1476728824.341, 1476728824.451, 1476728824.460, 1476728824.468, 1476728824.579,
     1476728824.590, 1476728824.598, 1476728824.721, 1476728824.730, 1476728824.788]

y = [6309927, 6310105, 6310116, 6310125, 6310299, 6310317, 6310326, 6310501, 6310513, 6310523, 6310688,
     6310703, 6310712, 6310875, 6310891, 6310900, 6311058, 6311069, 6311079, 6311243, 6311261, 6311272,
     6311414, 6311463, 6311479, 6311490, 6311665, 6311683, 6311692, 6311857, 6311867, 6311877, 6312037,
     6312054, 6312065, 6312230, 6312248, 6312257, 6312430, 6312442, 6312455, 6312646, 6312665, 6312675,
     6312860, 6312879, 6312894, 6313071, 6313103, 6313117, 6313287, 6313304, 6313315, 6313489, 6313505,
     6313518, 6313675, 6313692, 6313701, 6313875, 6313888, 6313898, 6314076, 6314093, 6314104, 6314285,
     6314306, 6314321, 6314526, 6314541, 6314638]

z, cov = np.polyfit(np.asarray(x), np.asarray(y), 1, cov=True)

std = np.sqrt(np.diag(cov))

print(z)
print(cov)
print(std)

Recommended answer

It looks like it's related to your x values: they span a total range of about 3, on top of an offset of about 1.5 billion.

In your code,

np.asarray(x)

converts the x values to an ndarray of float64. While float64 is fine for representing the x values themselves, it might not have enough precision for the computations needed to obtain the covariance matrix.
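To make the precision argument concrete, here is a minimal sketch. It uses synthetic, evenly spaced x values with the same offset and range as the question (an assumption, not the original dataset) and compares the condition number of the degree-1 design matrix before and after centering. The link to polyfit is hedged: as far as I know, np.polyfit derives the covariance from the inverse of the normal matrix built from this design matrix, which squares the conditioning problem.

```python
import numpy as np

# Synthetic stand-in for the question's x values (assumption):
# same offset of about 1.5e9, same total range of about 3.
offset = 1476728821.797
x = offset + np.linspace(0.0, 3.0, 71)

# Gap between adjacent float64 values near the offset: 2**-22, about 2.4e-07.
# The x values are representable, but little precision is left for arithmetic.
spacing = np.spacing(offset)

def design_cond(values):
    """Condition number of the degree-1 design matrix with columns [x, 1]."""
    return np.linalg.cond(np.vander(values, 2))

raw_cond = design_cond(x)                  # columns are almost parallel
centered_cond = design_cond(x - x.mean())  # columns are orthogonal

print("float64 spacing near offset:", spacing)
print("condition number, raw x:     %.3g" % raw_cond)
print("condition number, centered x: %.3g" % centered_cond)
```

The raw condition number comes out enormous while the centered one is close to 1, which is why subtracting the offset before fitting helps.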

np.asarray(x, dtype=np.float128)

would solve the problem, but polyfit can't work with float128 :(

TypeError: array type float128 is unsupported in linalg

As a workaround, you can subtract the offset from x and then use polyfit. This produces a covariance matrix with a positive diagonal:

# Center x on its mean so the fit works with small, well-scaled numbers.
x1 = np.asarray(x) - np.mean(x)
z1, cov1 = np.polyfit(x1, np.asarray(y), 1, cov=True)
std1 = np.sqrt(np.diag(cov1))

print(z1)    # prints: array([  1.56607841e+03,   6.31224162e+06])
print(cov1)  # prints: array([[  4.56066546e+00,  -2.90980285e-07],
             #                [ -2.90980285e-07,   3.36480951e+00]])
print(std1)  # prints: array([ 2.13557146,  1.83434171])

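You'll have to shift the results back accordingly: the slope is unchanged, but the intercept (and its uncertainty) now refers to x = np.mean(x) rather than x = 0.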
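One way to map the centered fit back to the original x axis is sketched below. The helper uncenter_linear_fit is hypothetical (it is not part of numpy), and the data is a synthetic stand-in rather than the dataset from the question. It relies on y = m*(x - x_mean) + b1 = m*x + (b1 - m*x_mean), which is a linear map of the coefficients, so the covariance transforms as J C J^T.

```python
import numpy as np

def uncenter_linear_fit(coeffs, cov, x_mean):
    """Map a degree-1 fit of y against (x - x_mean) back to the original x.

    Hypothetical helper, not part of numpy. coeffs is [slope, intercept]
    in np.polyfit order and cov is the matching 2x2 covariance matrix.
    """
    # New slope = m, new intercept = b1 - m * x_mean.
    jac = np.array([[1.0, 0.0],
                    [-x_mean, 1.0]])
    return jac @ np.asarray(coeffs), jac @ np.asarray(cov) @ jac.T

# Synthetic stand-in data (assumption): large offset, small range, noisy line.
rng = np.random.default_rng(0)
x = 1476728821.797 + np.linspace(0.0, 3.0, 71)
y = 6309927.0 + 1566.0 * (x - x[0]) + rng.normal(scale=10.0, size=x.size)

x_mean = x.mean()
z1, cov1 = np.polyfit(x - x_mean, y, 1, cov=True)   # well-conditioned fit
z0, cov0 = uncenter_linear_fit(z1, cov1, x_mean)    # back to the original x
std0 = np.sqrt(np.diag(cov0))

print(z0)
print(std0)
```

Note that the intercept at x = 0 is an extrapolation about 1.5 billion units away from the data, so its standard deviation is huge. If you only need predictions near the data, keeping the centered form is usually the more useful choice.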
