Keras model not able to generalise
Problem description
Can you help me find what is wrong with my Keras model? It has been overfitting since the second epoch. The code is as follows:
import random
import pandas as pd
import tensorflow as tf
import numpy
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import backend as K
import glob, os
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import Normalizer
class CustomSaver(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if epoch % 50 == 0:
            model_json = self.model.to_json()
            with open("model_{}.json".format(epoch), "w") as json_file:
                json_file.write(model_json)
            self.model.save_weights("model_weights_{}.h5".format(epoch))
            self.model.save("model_{}.h5".format(epoch))
            print("Saved model to disk")
model= tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=806, activation='relu',input_shape= (100,),activity_regularizer=tf.keras.regularizers.l1(0.01))) #50
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.Dense(units=806, activation='relu',activity_regularizer=tf.keras.regularizers.l1(0.01))) #50
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.Dense(units=806, activation='relu',activity_regularizer=tf.keras.regularizers.l1(0.01))) #50
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.Dense(units=14879, activation='softmax'))
optm = tf.keras.optimizers.Adam(learning_rate=0.0001, beta_1=0.9, beta_2=0.999, amsgrad=False)
model.compile(optimizer=optm,loss='categorical_crossentropy', metrics=['accuracy',tf.keras.metrics.Precision(),tf.keras.metrics.Recall()])
saver = CustomSaver()
encoder = LabelEncoder()
ds = pd.read_csv("all_labels.csv")
y = ds.iloc[:,0].values
encoder.fit(y)
dataset_val = pd.read_csv('validation_dataset.csv')
X_val = dataset_val.iloc[:,1:101].values
y_val = dataset_val.iloc[:,0].values
order = list(range(0,len(y_val)))
random.shuffle(order)
X_val = X_val[order,:]
y_val = y_val[order]
encoded_Y=encoder.transform(y_val)
y_val = tf.keras.utils.to_categorical(encoded_Y,14879)
X_val = X_val.astype('float32')
chunksize = 401999
co = 1
for dataset in pd.read_csv("training_dataset.csv", chunksize=chunksize):
    if co < 38:
        epoc = 100 #10
    else:
        epoc = 1000 #1000
    print(co)
    X = dataset.iloc[:,1:101].values
    y = dataset.iloc[:,0].values
    order = list(range(0,len(y)))
    random.shuffle(order)
    X = X[order,:]
    y = y[order]
    encoded_Y = encoder.transform(y)
    y = tf.keras.utils.to_categorical(encoded_Y,14879)
    X = X.astype('float32')
    model.fit(X,y,validation_data=(X_val,y_val),callbacks=[saver],batch_size=10000,epochs=epoc,verbose=1) #epochs=20
    co += 1
I loop over the training dataset in chunks because of the huge number of labels (401999, 14897); otherwise to_categorical runs out of memory.
The file that contains all labels is: all_labels.csv (https://drive.google.com/file/d/1UZvBTT9ZTM40fA5qJ8gdhmj-k6-SkpwS/view?usp=sharing).
The file that contains the full training dataset is: training_dataset.csv (https://drive.google.com/file/d/1LwRBytg44_x62lfLkx9iKTbEhA5IsJM1/view?usp=sharing).
The file that contains the validation dataset is: validation_dataset.csv (https://drive.google.com/open?id=1LZI2f-VGU3werjPIHUmdw0X_Q9nBAgXN).
The shape of the training dataset before being passed to the chunk loop is:
X.shape = (14878999, 100)
Y.shape = (14878999,)
Recommended answer
Your problem comes from your data:
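- You are trying to output 14879 values from an input of shape (batch_size, 100); with so little input information per class, your network cannot learn anything useful from the data.
- As @Nopileos said, a batch size of 10,000 is far too large. I don't think you have hundreds of millions of inputs, so consider using a more reasonable batch size!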
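To make the batch-size point concrete, here is a rough sketch (not part of the original answer) that counts how many gradient updates one epoch performs on a single 401,999-row chunk from the question. The value 256 is only an illustrative assumption for a "more reasonable" batch size, and `updates_per_epoch` is a hypothetical helper, not a Keras function:

```python
import math


def updates_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of mini-batches (weight updates) in one pass over the data.

    The last batch may be smaller than batch_size, hence the ceiling.
    """
    return math.ceil(n_samples / batch_size)


chunk_rows = 401_999  # chunksize used in the question

# Batch size from the question vs. an assumed smaller one (256 is illustrative).
for batch_size in (10_000, 256):
    steps = updates_per_epoch(chunk_rows, batch_size)
    print(f"batch_size={batch_size}: {steps} updates per epoch")
```

With `batch_size=10000`, each epoch over a chunk makes only a few dozen weight updates, while a smaller batch makes far more. Which value generalises best is something to check against the validation set rather than assume.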
Add the shapes of your inputs and labels, and what they correspond to, if you want us to help give you some intuition!
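As a side note on the out-of-memory error from to_categorical mentioned in the question, the size of the one-hot target matrix can be estimated with simple arithmetic. The sketch below assumes float32 one-hot targets (4 bytes per value) and int32 class indices; `one_hot_bytes` and `index_bytes` are hypothetical helper names used only for illustration:

```python
def one_hot_bytes(n_rows: int, n_classes: int, itemsize: int = 4) -> int:
    """Memory for a dense one-hot matrix (assumes float32, 4 bytes per value)."""
    return n_rows * n_classes * itemsize


def index_bytes(n_rows: int, itemsize: int = 4) -> int:
    """Memory for one integer class index per row (assumes int32)."""
    return n_rows * itemsize


n_classes = 14_879     # output units in the question's model
chunk_rows = 401_999   # rows per chunk in the question
all_rows = 14_878_999  # rows in the full training set

for name, rows in (("one chunk", chunk_rows), ("full dataset", all_rows)):
    dense_gb = one_hot_bytes(rows, n_classes) / 1e9
    sparse_mb = index_bytes(rows) / 1e6
    print(f"{name}: one-hot ~{dense_gb:.1f} GB, integer labels ~{sparse_mb:.1f} MB")
```

Keras also provides the `sparse_categorical_crossentropy` loss, which takes integer class indices directly, so the one-hot matrix may not be needed at all. If you try that, the Precision and Recall metrics used in the question would likely need to be revisited, since they are set up here for one-hot targets.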