check_arrays() limiting array dimensions in scikit-learn?
I would like to use sklearn/learning_curve.py, available in scikit-learn 0.15. After cloning this version, several functions no longer work because check_arrays() limits arrays to 2 dimensions.
>>> from sklearn import metrics
>>> from sklearn.cross_validation import train_test_split
>>> import numpy as np
>>> X = np.random.random((10,2,2,2))
>>> y = np.random.random((10,2,2,2))
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=3)
ValueError: Found array with dim 4. Expected <= 2
Using the same X and y I get the same error.
>>> mse = metrics.mean_squared_error
>>> mse(X,y)
ValueError: Found array with dim 4. Expected <= 2
If I go to sklearn/utils/validation.py and comment out lines 272, 273, and 274 as shown below, everything works just fine.
# if array.ndim >= 3:
#     raise ValueError("Found array with dim %d. Expected <= 2" %
#                      array.ndim)
Why are the dimensions of the arrays being limited to 2?
Because scikit-learn uses a 2-d convention (n_samples × n_features) for all feature data. If any function or method lets a higher-dimensional array through, that's usually just an oversight, and you can't rely on it.
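Given that convention, one workaround that avoids patching validation.py is to flatten every axis after the first into a single feature axis before calling scikit-learn. The following is a minimal sketch using only NumPy; `flatten_samples` is a hypothetical helper introduced here for illustration, not part of scikit-learn.

```python
import numpy as np


def flatten_samples(a):
    """Collapse all axes after the first into one feature axis.

    Hypothetical helper (not a scikit-learn function): an array of shape
    (n_samples, d1, d2, ...) becomes (n_samples, d1 * d2 * ...).
    """
    a = np.asarray(a)
    return a.reshape(a.shape[0], -1)


rng = np.random.RandomState(3)
X = rng.random_sample((10, 2, 2, 2))
y = rng.random_sample((10, 2, 2, 2))

# Both arrays now follow the (n_samples, n_features) layout.
X2d = flatten_samples(X)
y2d = flatten_samples(y)
print(X2d.shape, y2d.shape)

# The original layout can be restored after the scikit-learn call if needed.
X_back = X2d.reshape(X.shape)
```

The 2-d arrays `X2d` and `y2d` can then be passed to functions such as `train_test_split` or `mean_squared_error` in place of the 4-d originals. Reshaping only changes how the same values are indexed, so no data is lost and the round trip back to the original shape is exact.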