ValueError: setting an array element with a sequence with Decision Tree where all the rows have equal elements?

Problem description

I am trying to fit a decision tree to matrices of features and labels. Here is my code:

print FEATURES_DATA[0]
print ""
print TARGET[0]
print ""
print np.unique(list(map(len, FEATURES_DATA[0])))

This gives the following output:

[ array([[3, 3, 3, ..., 7, 7, 7],
       [3, 3, 3, ..., 7, 7, 7],
       [3, 3, 3, ..., 7, 7, 7],
       ..., 
       [2, 2, 2, ..., 6, 6, 6],
       [2, 2, 2, ..., 6, 6, 6],
       [2, 2, 2, ..., 6, 6, 6]], dtype=uint8)]

[ array([[31],
       [31],
       [31],
       ..., 
       [22],
       [22],
       [22]], dtype=uint8)]

[463511]

The matrix actually contains 463511 samples.

Thereafter, I run the following block:

from sklearn.tree import DecisionTreeClassifier
for i in xrange(5):
    Xtrain=FEATURES_DATA[i]
    Ytrain=TARGET[i]
    clf=DecisionTreeClassifier()
    clf.fit(Xtrain,Ytrain)

This gives me the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-3d8b2a7a3e5f> in <module>()
      4     Ytrain=TARGET[i]
      5     clf=DecisionTreeClassifier()
----> 6     clf.fit(Xtrain,Ytrain)

C:\Users\singhg2\AppData\Local\Enthought\Canopy\User\lib\site-packages\sklearn\tree\tree.pyc in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
    152         random_state = check_random_state(self.random_state)
    153         if check_input:
--> 154             X = check_array(X, dtype=DTYPE, accept_sparse="csc")
    155             if issparse(X):
    156                 X.sort_indices()

C:\Users\singhg2\AppData\Local\Enthought\Canopy\User\lib\site-packages\sklearn\utils\validation.pyc in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    371                                       force_all_finite)
    372     else:
--> 373         array = np.array(array, dtype=dtype, order=order, copy=copy)
    374 
    375         if ensure_2d:

ValueError: setting an array element with a sequence.

I searched other posts on SO and found that most answers attribute this error to matrices that are not entirely numeric, or to samples that differ in length. Neither seems to be the case here.

Any help?

Recommended answer

If print FEATURES_DATA[0] actually prints

[ array([[3, 3, 3, ..., 7, 7, 7],
       [3, 3, 3, ..., 7, 7, 7],
       [3, 3, 3, ..., 7, 7, 7],
       ..., 
       [2, 2, 2, ..., 6, 6, 6],
       [2, 2, 2, ..., 6, 6, 6],
       [2, 2, 2, ..., 6, 6, 6]], dtype=uint8)]

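then the problem is that FEATURES_DATA[0] is a Python list with a NumPy array inside it. (You can tell from the outer [ and ] around array(...).) This also explains why the length check printed [463511]: map(len, FEATURES_DATA[0]) visits the single wrapped array and returns its number of rows, not the length of each row.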
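To see this kind of nesting directly, it can help to print the type and shape of the outer object and of each element it holds. The following is a minimal sketch on small stand-in data; the `describe` helper and the `wrapped` and `inner` names are hypothetical and not part of the original code.

```python
import numpy as np

def describe(name, obj):
    """Summarise the outer container and each element it holds."""
    lines = ["%s: outer type=%s, len=%d" % (name, type(obj).__name__, len(obj))]
    for k, item in enumerate(obj):
        # Plain Python objects have no .shape, so fall back to None.
        shape = getattr(item, "shape", None)
        lines.append("  [%d] type=%s, shape=%s" % (k, type(item).__name__, shape))
    return lines

# Hypothetical stand-in for FEATURES_DATA[0]: a list wrapping one 2-D array.
inner = np.array([[3, 3, 7, 7],
                  [3, 3, 7, 7],
                  [2, 2, 6, 6]], dtype=np.uint8)
wrapped = [inner]

# The length check from the question: it counts the rows of the wrapped
# array, so it reports a single value even though the nesting is wrong.
row_check = np.unique(list(map(len, wrapped)))

report = describe("wrapped", wrapped)
print("\n".join(report))
print(row_check)
```

An outer length of 1 together with a 2-D element shape indicates that the real feature matrix sits one level deeper than expected.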

You can select the first (and only) element of the list to fix it:

from sklearn.tree import DecisionTreeClassifier
for i in range(5):
    # Each entry is a one-element list; [0] unwraps the actual array.
    Xtrain=FEATURES_DATA[i][0]
    Ytrain=TARGET[i][0]
    clf=DecisionTreeClassifier()
    clf.fit(Xtrain,Ytrain)
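As a slightly more defensive variant of the fix above, the unwrapping can be checked before fitting, and the (n, 1) label column can be flattened to the 1-D shape that scikit-learn classifiers conventionally take for y. This is only a sketch on small stand-in data; `unwrap_single` and the `demo_*` names are hypothetical.

```python
import numpy as np

def unwrap_single(container):
    """Return the one array held in a length-1 list or object array."""
    if len(container) != 1:
        raise ValueError("expected exactly one element, got %d" % len(container))
    return np.asarray(container[0])

# Hypothetical stand-ins for FEATURES_DATA[i] and TARGET[i].
demo_features = [np.array([[3, 3, 7, 7],
                           [3, 3, 7, 7],
                           [2, 2, 6, 6]], dtype=np.uint8)]
demo_target = [np.array([[31], [31], [22]], dtype=np.uint8)]

X_demo = unwrap_single(demo_features)        # 2-D: (n_samples, n_features)
y_demo = unwrap_single(demo_target).ravel()  # 1-D: (n_samples,)

print(X_demo.shape, y_demo.shape)
```

Arrays shaped like `X_demo` and `y_demo` are what would then be passed to clf.fit in the loop above.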
