Math domain error while using PCA

Problem description

I am using Python's scikit-learn package to run PCA, and I am getting a math domain error:
C:\Users\Akshenndra\Anaconda2\lib\site-packages\sklearn\decomposition\pca.pyc in _assess_dimension_(spectrum, rank, n_samples, n_features)
     78         for j in range(i + 1, len(spectrum)):
     79             pa += log((spectrum[i] - spectrum[j]) *
---> 80                       (1. / spectrum_[j] - 1. / spectrum_[i])) + log(n_samples)
     81 
     82     ll = pu + pl + pv + pp - pa / 2. - rank * log(n_samples) / 2.

ValueError: math domain error

I know that a math domain error is raised when taking the logarithm of a negative number, but I don't understand how a negative number can end up inside the logarithm here, because the same code works fine on other datasets. Could this be related to the note on the scikit-learn website: "This implementation uses the scipy.linalg implementation of the singular value decomposition. It only works for dense arrays and is not scalable to large dimensional data."? My data contains a large number of 0 values.
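For reference, the error in the traceback can be reproduced in isolation. The sketch below is not the scikit-learn code itself: `pair_term` is a hypothetical helper that mirrors the expression inside the logarithm for a single (i, j) pair, and it assumes that two entries of the spectrum are equal, which is one plausible situation for data with many zeros. In that case the product is exactly zero rather than negative, and `math.log` rejects zero as well.

```python
import math


def pair_term(s_i, s_j):
    """Argument of the log in the traceback for one (i, j) pair (hypothetical helper)."""
    return (s_i - s_j) * (1.0 / s_j - 1.0 / s_i)


# Distinct positive values: both factors are positive, so log() is defined.
print(math.log(pair_term(4.0, 2.0)))

# Two equal values: the product is exactly 0.0, and log(0.0) raises ValueError.
try:
    math.log(pair_term(2.0, 2.0))
except ValueError as exc:
    print("ValueError:", exc)
```

So the logarithm does not need a negative argument to fail; a zero argument is enough.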

Recommended answer

I think you should add 1 inside the logarithm instead, as described on the numpy log1p documentation page. log(p + 1) = 0 when p = 0, whereas log(p) becomes very negative as p approaches 0 (for example, log(e^-99) = -99). As the documentation puts it:

"For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy."
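The behavior described in that quote can be checked with a small sketch. It uses `math.log1p` from the Python standard library rather than numpy's `log1p`, on the assumption that the two behave the same way for a single real input; the value of `x` is made up for illustration.

```python
import math

x = 1e-20  # so small that 1 + x rounds to exactly 1.0 in double precision

print(1.0 + x == 1.0)     # True
print(math.log(1.0 + x))  # 0.0, the contribution of x is lost
print(math.log1p(x))      # 1e-20, x is preserved

# At p = 0 the result is 0.0 instead of an error.
print(math.log1p(0.0))    # 0.0
```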

The code can be modified as follows to work around the problem you are trying to resolve:

# Modified loop from _assess_dimension_: 1 is added inside each logarithm
for i in range(rank):
    for j in range(i + 1, len(spectrum)):
        pa += log((spectrum[i] - spectrum[j]) *
                  (1. / spectrum_[j] - 1. / spectrum_[i]) + 1) + log(n_samples + 1)

ll = pu + pl + pv + pp - pa / 2. - rank * log(n_samples + 1) / 2.
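To try the modified loop outside of scikit-learn, it can be wrapped in a standalone function. This is only a sketch: `pa_with_offset` is a hypothetical name, the spectrum values are made up, and `spectrum_` is passed as a plain copy of `spectrum` for simplicity, which is not necessarily what the library does.

```python
from math import log


def pa_with_offset(spectrum, spectrum_, rank, n_samples):
    """Accumulate the pairwise term with 1 added inside each logarithm."""
    pa = 0.0
    for i in range(rank):
        for j in range(i + 1, len(spectrum)):
            pa += log((spectrum[i] - spectrum[j]) *
                      (1. / spectrum_[j] - 1. / spectrum_[i]) + 1) + log(n_samples + 1)
    return pa


# Made-up spectrum with a repeated value, which makes the unmodified log fail.
spectrum = [3.0, 2.0, 2.0, 1.0]
print(pa_with_offset(spectrum, list(spectrum), rank=2, n_samples=10))
```

Keep in mind that the offset changes the value of every term, not only the failing ones, so the resulting score differs from what the unmodified expression would give on data where it does run.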
