Scikit学习中的分类测试,ValueError:使用序列设置数组元素 [英] Classification test in Scikit-learn, ValueError: setting an array element with a sequence

查看:78
本文介绍了Scikit学习中的分类测试,ValueError:使用序列设置数组元素的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

使用关于多类adaboost的教程,我正在尝试对具有两个类的某些图像进行分类(但是,如果问题是二进制的,我不认为该算法不起作用).然后,我将扩展我的样本以包括其他类.

Using the tutorial on multiclass adaboost, I'm trying to classify some images that have two classes (but I don't suppose the algorithm shouldn't work if the problem is binary). Then I'm going to extend my samples to include other classes.

我目前的测试非常小,总共只有17张图像,其中10张用于训练,7张用于测试.

My current test is quite small, only 17 images in all, 10 for training, 7 for testing.

现在我有两个课程:0: no vehicle, 1: vehicle present 我使用整数标签是因为根据上面链接中的示例,训练数据由基于整数的标签组成.

For now I have two classes: 0: no vehicle, 1: vehicle present I used integer labels because according to the example in the link above, the training data consists of integer-based labels.

我仅编辑了提供的示例以包含我自己的图像文件,但出现错误.

I've edited the provided example only a bit, to include my own image files, but I'm getting an error.

Traceback (most recent call last):
  File "C:\Users\app\Documents\Python Scripts\carclassify.py", line 66, in <module>
    bdt_discrete.fit(X_train, y_train)
  File "C:\Users\app\Anaconda\lib\site-packages\sklearn\ensemble\weight_boosting.py", line 389, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "C:\Users\app\Anaconda\lib\site-packages\sklearn\ensemble\weight_boosting.py", line 99, in fit
    X = np.ascontiguousarray(array2d(X), dtype=DTYPE)
  File "C:\Users\app\Anaconda\lib\site-packages\numpy\core\numeric.py", line 408, in ascontiguousarray
    return array(a, dtype, copy=False, order='C', ndmin=1)
ValueError: setting an array element with a sequence.

以下是我的代码,摘自scikit-learn网站上的示例:

The following is my code, adapted from the example on the scikit-learn website:

f = open("PATH_TO_SAMPLES\\samples.txt",'r')
out = f.read().splitlines()
import numpy as np

imgs = []
tmp_hogs = []
# 13 of the images are with vehicles, 4 are without
labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0]

for file in out:
        filepath = "C:\PATH_TO_SAMPLE_IMAGES\\" + file
        curr_img = color.rgb2gray(io.imread(filepath))
        imgs.append(resize(curr_img,(60,40)))
        fd, hog_image = hog(curr_img, orientations=8, pixels_per_cell=(16, 16),
                 cells_per_block=(1, 1), visualise=True)
        tmp_hogs.append(fd) 

img_hogs = np.array(tmp_hogs)
n_split = 10
X_train, X_test = img_hogs[:n_split], X[n_split:] # all first ten images with vehicles
y_train, y_test = labels[:n_split], labels[n_split:] # 3 images with vehicles, 4 without

#now all the code below is straight off the example on scikit-learn's website

bdt_real = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),
    n_estimators=600,
    learning_rate=1)

bdt_discrete = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),
    n_estimators=600,
    learning_rate=1.5,
    algorithm="SAMME")

bdt_real.fit(X_train, y_train)
bdt_discrete.fit(X_train, y_train)

real_test_errors = []
discrete_test_errors = []

for real_test_predict, discrete_train_predict in zip(
        bdt_real.staged_predict(X_test), bdt_discrete.staged_predict(X_test)):
    real_test_errors.append(
        1. - accuracy_score(real_test_predict, y_test))
    discrete_test_errors.append(
        1. - accuracy_score(discrete_train_predict, y_test))

n_trees = xrange(1, len(bdt_discrete) + 1)

pl.figure(figsize=(15, 5))

pl.subplot(131)
pl.plot(n_trees, discrete_test_errors, c='black', label='SAMME')
pl.plot(n_trees, real_test_errors, c='black',
        linestyle='dashed', label='SAMME.R')
pl.legend()
pl.ylim(0.18, 0.62)
pl.ylabel('Test Error')
pl.xlabel('Number of Trees')

pl.subplot(132)
pl.plot(n_trees, bdt_discrete.estimator_errors_, "b", label='SAMME', alpha=.5)
pl.plot(n_trees, bdt_real.estimator_errors_, "r", label='SAMME.R', alpha=.5)
pl.legend()
pl.ylabel('Error')
pl.xlabel('Number of Trees')
pl.ylim((.2,
        max(bdt_real.estimator_errors_.max(),
            bdt_discrete.estimator_errors_.max()) * 1.2))
pl.xlim((-20, len(bdt_discrete) + 20))

pl.subplot(133)
pl.plot(n_trees, bdt_discrete.estimator_weights_, "b", label='SAMME')
pl.legend()
pl.ylabel('Weight')
pl.xlabel('Number of Trees')
pl.ylim((0, bdt_discrete.estimator_weights_.max() * 1.2))
pl.xlim((-20, len(bdt_discrete) + 20))

# prevent overlapping y-axis labels
pl.subplots_adjust(wspace=0.25)
pl.show()

编辑

我输入

print tmp_hogs

输出是这样的:

[ array([ 0.27621208,  0.11038658,  0.10698133, ...,  0.08661556,        0.04612063,  0.0280782 ]), 
        array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -1.29909838e-15,  -7.01780982e-17,  -1.24900943e-15]), 
        array([ 0.0503603 ,  0.1497235 ,  0.2372957 , ...,  0.07249325, 0.04545541,  0.00903818]), 
        array([ 0.27299191,  0.13122109,  0.0719268 , ...,  0.0848522 ,  0.04789403,  0.01387038]), 
        array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ...,  3.32140617e-17,  -6.58924128e-17,  -6.23567224e-16]), 
        array([ 0.37431874,  0.18094303,  0.01219871, ...,  0.06501856, 0.04855516,  0.02439321]), 
        array([ 0.41087302,  0.16478851,  0.03396399, ...,  0.09511273, 0.04077713,  0.03945513]), 
        array([ 0.17753915,  0.07025565,  0.09136909, ...,  0.03396507, 0.01379266,  0.01645722]), 
        array([ 0.40605587,  0.05915388,  0.03767763, ...,  0.08981079, 0.05452031,  0.01725399]), 
        array([ 0.        ,  0.        ,  0.        , ...,  0.00579303, 0.02053979,  0.0019091 ]), 
        array([ 0.31550735,  0.11988131,  0.07716529, ...,  0.09815158, 0.03058497,  0.02236517]), 
        array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -3.51175682e-16,   1.31619418e-03,   2.86127901e-16]), 
        array([ 0.21381704,  0.22352378,  0.11568828, ...,  0.06311083, 0.02696666,  0.00402261]), 
        array([ 0.17480064,  0.1469145 ,  0.16336016, ...,  0.05614001, 0.03244093,  0.00524034]), 
        array([ 0.        ,  0.        ,  0.        , ...,  0.03089959, 0.00509584,  0.00247698]), 
        array([ 0.04711166,  0.0218663 ,  0.05316   , ...,  0.04214594, 0.04892439,  0.25840958]), 
        array([ 0.05357464,  0.00530857,  0.07162301, ...,  0.06802692, 0.08331959,  0.26619977])]

然后我跑了

print img_hogs

的输出是:

[ array([ 0.27621208,  0.11038658,  0.10698133, ...,  0.08661556, 0.04612063,  0.0280782 ])
 array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -1.29909838e-15,  -7.01780982e-17,  -1.24900943e-15])
 array([ 0.0503603 ,  0.1497235 ,  0.2372957 , ...,  0.07249325, 0.04545541,  0.00903818])
 array([ 0.27299191,  0.13122109,  0.0719268 , ...,  0.0848522 , 0.04789403,  0.01387038])
 array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., 3.32140617e-17,  -6.58924128e-17,  -6.23567224e-16])
 array([ 0.37431874,  0.18094303,  0.01219871, ...,  0.06501856, 0.04855516,  0.02439321])
 array([ 0.41087302,  0.16478851,  0.03396399, ...,  0.09511273, 0.04077713,  0.03945513])
 array([ 0.17753915,  0.07025565,  0.09136909, ...,  0.03396507, 0.01379266,  0.01645722])
 array([ 0.40605587,  0.05915388,  0.03767763, ...,  0.08981079, 0.05452031,  0.01725399])
 array([ 0.        ,  0.        ,  0.        , ...,  0.00579303, 0.02053979,  0.0019091 ])
 array([ 0.31550735,  0.11988131,  0.07716529, ...,  0.09815158, 0.03058497,  0.02236517])
 array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -3.51175682e-16,   1.31619418e-03,   2.86127901e-16])
 array([ 0.21381704,  0.22352378,  0.11568828, ...,  0.06311083, 0.02696666,  0.00402261])
 array([ 0.17480064,  0.1469145 ,  0.16336016, ...,  0.05614001, 0.03244093,  0.00524034])
 array([ 0.        ,  0.        ,  0.        , ...,  0.03089959, 0.00509584,  0.00247698])
 array([ 0.04711166,  0.0218663 ,  0.05316   , ...,  0.04214594, 0.04892439,  0.25840958])
 array([ 0.05357464,  0.00530857,  0.07162301, ...,  0.06802692, 0.08331959,  0.26619977])]

推荐答案

尝试:

imgs = []
tmp_hogs = np.zeros((17, 256))
# 13 of the images are with vehicles, 4 are without
labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0]

i = 0
for file in out:
        filepath = "C:\PATH_TO_SAMPLE_IMAGES\\" + file
        curr_img = color.rgb2gray(io.imread(filepath))
        imgs.append(resize(curr_img,(60,40)))
        fd, hog_image = hog(curr_img, orientations=8, pixels_per_cell=(16, 16),
                 cells_per_block=(1, 1), visualise=True)
        tmp_hogs[i,:] = fd
        i+=1

img_hogs = tmp_hogs

这篇关于Scikit学习中的分类测试,ValueError:使用序列设置数组元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆