StandardScaler - ValueError: Input contains NaN, infinity or a value too large for dtype('float64')

Views: 1321 Published: 2020/5/16 20:54:26 Tags: python nan


Problem description

I have the following code:

X = df_X.as_matrix(header[1:col_num])
scaler = preprocessing.StandardScaler().fit(X)
X_nor = scaler.transform(X)

and I get the following error:

  File "/Users/edamame/Library/python_virenv/lib/python2.7/site-packages/sklearn/utils/validation.py", line 54, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

I used:

print(np.isinf(X))
print(np.isnan(X))

which gives me the output below. This doesn't really tell me which element has the issue, as I have millions of rows.

[[False False False ..., False False False]
 [False False False ..., False False False]
 [False False False ..., False False False]
 ..., 
 [False False False ..., False False False]
 [False False False ..., False False False]
 [False False False ..., False False False]]

Is there a way to identify which value in the matrix X actually causes the problem? How do people avoid it in general?

Recommended answer

numpy contains various element-wise logical tests for this sort of thing.

In your particular case, you will want to use isinf and isnan.
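With millions of rows, printing the full boolean mask is not informative, so it can help to reduce the mask to a yes/no answer and a count first. The following is a minimal sketch; the small array `X` here is a hypothetical stand-in for the asker's matrix, not their actual data.

```python
import numpy as np

# Hypothetical stand-in for the asker's matrix X.
X = np.array([[0.1, 0.2, 0.3],
              [np.nan, 0.5, 0.6],
              [0.7, np.inf, 0.9]])

# Reduce the element-wise masks to a single True/False each.
has_nan = bool(np.isnan(X).any())
has_inf = bool(np.isinf(X).any())

# Count how many offending entries there are in total.
nan_count = int(np.isnan(X).sum())
inf_count = int(np.isinf(X).sum())

# Count NaN entries per column (axis=0 collapses the rows).
nan_per_column = np.isnan(X).sum(axis=0)

print(has_nan, has_inf, nan_count, inf_count)
print(nan_per_column)
```

If both flags come back False, the data itself is probably not the cause and it may be worth checking the dtype of the array instead.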

In response to your edit:

You can pass the result of np.isinf() or np.isnan() to np.where(), which will return the indices where a condition is true. Here's a quick example:

import numpy as np

test = np.array([0.1, 0.3, float("Inf"), 0.2])

# np.where returns a tuple of index arrays, one per dimension
bad_indices = np.where(np.isinf(test))

print(bad_indices)  # (array([2]),)

You can then use those indices to replace the content of the array:

test[bad_indices] = -1
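The same idea carries over to a 2-D matrix like the one in the question. The sketch below is one possible approach, not the only one: it uses np.isfinite() to catch NaN and both infinities in a single mask, np.argwhere() to list the (row, column) positions, and then drops the affected rows instead of overwriting them with -1. The array `X` is again a hypothetical stand-in, and whether dropping rows or replacing values is appropriate depends on the data.

```python
import numpy as np

# Hypothetical stand-in for the asker's matrix X.
X = np.array([[0.1, 0.2, 0.3],
              [np.nan, 0.5, 0.6],
              [0.7, np.inf, 0.9],
              [1.0, 1.1, 1.2]])

# True wherever an entry is NaN, +inf or -inf.
bad_mask = ~np.isfinite(X)

# One (row, column) pair per offending entry.
bad_positions = np.argwhere(bad_mask)
print(bad_positions)

# Keep only the rows in which every entry is finite.
good_rows = ~bad_mask.any(axis=1)
X_clean = X[good_rows]
print(X_clean.shape)
```

An array cleaned this way should no longer trip the finiteness check when it is passed to StandardScaler().fit().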
