Problems while using ScikitLearn's Neural Network implementation


Problem description


I am trying to implement image processing using the neural network implementation provided by Scikit Learn. I have close to 10,000 color images in 'JPG' format. I converted those images to 'PNG' format and removed the color information, so the new images are purely black and white. After converting these images to vectors, the image vectors form the input to the neural network.

Each image also has a corresponding target, which forms the output of the neural network.

The input file contains only 0s and 1s and nothing else. The output for each image is a vector of 22 continuous values between 0 and 1.

To start off with the processing, I began with only 100 images and their corresponding outputs and got the following error:

ValueError: Array contains NaN or infinity

I would also like to point out that the first iteration was completed here and I encountered this error during the second iteration.

To try something different, I trimmed my input and output down to 10 images each. Using the same piece of code (coming up shortly), I was able to complete 7 iterations (I had set the number of iterations to 20) and then received the same error.

I then changed the number of iterations to 5, just to check if it works. After this change, I got the following error:

ValueError: bad input shape (10, 22)

I also tried to use np.ravel() on my input and output, but that gave me the NaN or infinity error again.

Here is the code I am using for the whole process:

import numpy as np
import csv
import matplotlib.pyplot as plt
from scipy.ndimage import convolve
from sklearn import linear_model, datasets, metrics
from sklearn.cross_validation import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline


def ReadCsv(fileName):
    in_file = open(fileName, 'rUb')
    reader = csv.reader(in_file, delimiter=',', quotechar='"')
    data = [[]]
    for row in reader:
        data.append(row)

    data.pop(0)
    return data

X_train = np.asarray(ReadCsv('100Images.csv'), 'float32')
Y_train = np.asarray(ReadCsv('100Images_Y_new.csv'), 'float32')
X_test = np.asarray(ReadCsv('ImagesForTest.csv'), 'float32')
Y_test = np.asarray(ReadCsv('ImagesForTest_Y_new.csv'), 'float32')

logistic = linear_model.LogisticRegression()
rbm = BernoulliRBM(random_state=0, verbose=True)

classifier = Pipeline(steps=[('rbm', rbm), ('logistic', logistic)])

rbm.learning_rate = 0.06
rbm.n_iter = 5

rbm.n_components = 100
logistic.C = 6000.0

classifier.fit(X_train, Y_train)

print()
print("Logistic regression using RBM features:\n%s\n" % (
    metrics.classification_report(
        Y_test,
        classifier.predict(X_test))))

I would really appreciate any kind of help on this issue.

TIA.

Solution

Changing the learning rate (i.e. rbm.learning_rate) to a smaller value might fix this issue.

At least this fixed the problem I had before.
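
As a minimal sketch of that suggestion, the snippet below fits a BernoulliRBM with a smaller learning rate and checks that the learned weights stay finite. The synthetic binary data, the helper name fit_rbm, and the value 0.001 are assumptions made for illustration; they are not taken from the question, and a suitable learning rate for the real images would have to be found by experiment.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Hypothetical stand-in for the binarized image vectors (NOT the asker's
# CSV files): 100 samples with 64 features, each feature either 0 or 1.
rng = np.random.RandomState(0)
X = (rng.rand(100, 64) > 0.5).astype(np.float32)


def fit_rbm(X, learning_rate, n_iter=5, n_components=100):
    """Fit a BernoulliRBM with the given learning rate (illustrative helper)."""
    rbm = BernoulliRBM(
        n_components=n_components,
        learning_rate=learning_rate,
        n_iter=n_iter,
        random_state=0,
    )
    rbm.fit(X)
    return rbm


# 0.001 is an assumed "smaller" value; the question used 0.06.
rbm = fit_rbm(X, learning_rate=0.001)

# transform() returns the hidden-unit activation probabilities.
features = rbm.transform(X)

# If training diverged, the weights would contain NaN or infinity.
weights_are_finite = bool(np.all(np.isfinite(rbm.components_)))
print(features.shape, weights_are_finite)
```

Checking rbm.components_ with np.isfinite after fitting is one way to see whether a given learning rate diverges before the features reach the next pipeline step. Note that this only concerns the NaN error; the "bad input shape (10, 22)" error is most likely a separate issue, since LogisticRegression expects a one-dimensional array of class labels rather than a 22-column continuous target.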
