check_arrays() limiting array dimensions in scikit-learn?


Question

I would like to use the sklearn.learning_curves.py available in scikit-learn X0.15. After I cloned this version, several functions no longer work because check_arrays() is limiting the dimension of the arrays to 2.

>>> from sklearn import metrics 
>>> from sklearn.cross_validation import train_test_split 
>>> import numpy as np
>>> X = np.random.random((10,2,2,2))
>>> y = np.random.random((10,2,2,2))
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=3)
ValueError: Found array with dim 4. Expected <= 2

Using the same X and y I get the same error.

>>> mse = metrics.mean_squared_error
>>> mse(X,y)
ValueError: Found array with dim 4. Expected <= 2

If I go to sklearn.utils.validation.py and comment out lines 272, 273, and 274, as shown below, everything works just fine.

# if array.ndim >= 3:
#     raise ValueError("Found array with dim %d. Expected <= 2" %
#                      array.ndim)

Why are the dimensions of the arrays being limited to 2?

Answer

Because scikit-learn uses a 2-d convention (n_samples × n_features) for all feature data. If any function or method lets a higher-dimensional array through, that's usually just an oversight, and you can't rely on it.
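
Given that convention, one way to avoid patching validation.py is to flatten every axis after the first into a single feature axis before calling into scikit-learn. The snippet below is only a sketch of that idea using NumPy alone: the variable names (X2d, y2d, X_back) are made up for illustration, and it assumes that treating each sample's 2×2×2 block as 8 independent features is acceptable for your problem.

```python
import numpy as np

# Same shapes as in the question: 10 samples, each a 2x2x2 block.
rng = np.random.RandomState(3)
X = rng.random_sample((10, 2, 2, 2))
y = rng.random_sample((10, 2, 2, 2))

# Keep axis 0 as n_samples and collapse the remaining axes into n_features.
X2d = X.reshape(X.shape[0], -1)
y2d = y.reshape(y.shape[0], -1)
print(X2d.shape, y2d.shape)

# The element-wise mean squared error is unchanged by the reshape,
# so nothing is lost by working with the 2-d view.
mse_4d = np.mean((X - y) ** 2)
mse_2d = np.mean((X2d - y2d) ** 2)

# The original layout can be restored afterwards if it is needed.
X_back = X2d.reshape(X.shape)
```

The resulting (10, 8) arrays follow the (n_samples, n_features) layout, so they can be passed to functions such as train_test_split and mean_squared_error without touching the library's validation code.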
