GridSearchCV for number of neurons
Problem description
I am trying to teach myself how to grid-search the number of neurons in a basic multi-layer neural network. I am using GridSearchCV and KerasClassifier in Python, together with Keras. The code below works very well for other data sets, but for some reason I could not make it work for the Iris dataset, and I cannot figure out why; I must be missing something. The result I get is:
Best: 0.000000 using {'n_neurons': 3}
0.000000 (0.000000) with: {'n_neurons': 3}
0.000000 (0.000000) with: {'n_neurons': 5}
from pandas import read_csv
import numpy
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from sklearn.model_selection import GridSearchCV
dataframe=read_csv("iris.csv", header=None)
dataset=dataframe.values
X=dataset[:,0:4].astype(float)
Y=dataset[:,4]
seed=7
numpy.random.seed(seed)
#encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
#one-hot encoding
dummy_y = np_utils.to_categorical(encoded_Y)
#scale the data
scaler = StandardScaler()
X = scaler.fit_transform(X)
def create_model(n_neurons=1):
    # create model
    model = Sequential()
    model.add(Dense(n_neurons, input_dim=X.shape[1], activation='relu'))  # hidden layer
    model.add(Dense(3, activation='softmax'))  # output layer
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=10, initial_epoch=0, verbose=0)
# define the grid search parameters
neurons=[3, 5]
# the grid search below runs 3-fold cross-validation; k can be changed with the cv argument
param_grid = dict(n_neurons=neurons)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X, dummy_y)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
For the purpose of illustration and computational efficiency, I search over only two values. I apologize for asking such a simple question. By the way, I am new to Python; I switched from R because I realized that the deep learning community is using Python.
Recommended answer
Haha, this is probably the funniest thing I have ever experienced on Stack Overflow :) Check:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
and you should see different behavior. The reason every configuration scores exactly 0 (the reported score is accuracy, so 0 means that not a single test sample was classified correctly) is that you haven't shuffled your data. Because Iris consists of three balanced classes stored in order, each of your three folds had a single class as its target:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 (first fold ends here) 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 (second fold ends here) 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
A model trained on two of these folds never sees the class that makes up the third, so it cannot predict that class on the test fold. That is why every score is exactly 0, whatever the number of neurons.
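The fold composition can be checked without training any network. The following is a minimal sketch, assuming that GridSearchCV falls back to a plain, unshuffled KFold for one-hot encoded targets; the label array `y` is a synthetic stand-in for the ordered Iris labels, and `X` is a placeholder rather than the real features.

```python
import numpy as np
from sklearn.model_selection import KFold

# Stand-in for the Iris labels: 50 samples of each class, stored in order.
y = np.repeat([0, 1, 2], 50)
# Placeholder features (assumption): only the split indices matter here.
X = np.zeros((150, 4))

fold_report = []
# KFold without shuffle=True cuts the data into contiguous blocks.
for train_idx, test_idx in KFold(n_splits=3).split(X):
    train_classes = sorted(set(y[train_idx].tolist()))
    test_classes = sorted(set(y[test_idx].tolist()))
    fold_report.append((train_classes, test_classes))
    print("train classes:", train_classes, "| test classes:", test_classes)
```

In every split the test fold holds exactly one class, and that class is missing from the corresponding training folds.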
Try shuffling your data first; this should give the expected behavior.
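Two possible ways to do the shuffling are sketched below, as an illustration rather than the only approach. The arrays are synthetic stand-ins for the Iris data, and the names `order`, `X_shuffled`, `y_shuffled` and `cv` are introduced here for the example only.

```python
import numpy as np
from sklearn.model_selection import KFold

seed = 7
# Synthetic stand-ins (assumption): ordered labels and dummy features.
y = np.repeat([0, 1, 2], 50)
X = np.arange(150 * 4, dtype=float).reshape(150, 4)

# Option 1: shuffle rows and labels together with one shared permutation.
rng = np.random.RandomState(seed)
order = rng.permutation(len(y))
X_shuffled, y_shuffled = X[order], y[order]

# Option 2: leave the data as is and let the splitter shuffle the indices.
cv = KFold(n_splits=3, shuffle=True, random_state=seed)
classes_per_test_fold = [
    sorted(set(y[test_idx].tolist())) for _, test_idx in cv.split(X)
]
print(classes_per_test_fold)
```

With the second option, the splitter can be handed to the search directly, for example `GridSearchCV(estimator=model, param_grid=param_grid, cv=cv)`, since the `cv` argument accepts a splitter object as well as an integer.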