Training a Keras model on multiple feature files that are read in sequentially to save memory

Question

I'm running into memory issues when trying to read in massive feature files (see below). I figured I'd split the training files and read them in sequentially. What is the best way to do that?

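# imports assumed by this snippet (standalone Keras)
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import SGD

x_train = np.load(path_features + 'x_train.npy')
y_train = np.load(path_features + 'y_train.npy')
x_test = np.load(path_features + 'x_test.npy')
y_test = np.load(path_features + 'y_test.npy')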

path_models = '../pipelines/' + pipeline + '/models/'

# global params
verbose_level = 1
inp_shape = x_train.shape[1:]

# models
if model_type == 'standard_4':
    print('Starting to train ' + feature_type + '_' + model_type + '.')
    num_classes = 1
    dropout_prob = 0.5
    activation_function = 'relu'
    loss_function = 'binary_crossentropy'
    batch_size = 32
    epoch_count = 100
    opt = SGD(lr=0.001)

    model = Sequential()
    model.add(Conv2D(filters=16, kernel_size=(3, 3), input_shape=inp_shape))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(filters=32, kernel_size=(3, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(64, activation=activation_function))
    model.add(Dropout(rate=dropout_prob))
    model.add(Dense(32, activation=activation_function))
    model.add(Dense(num_classes, activation='sigmoid'))
    model.summary()
    model.compile(loss=loss_function, optimizer=opt, metrics=['accuracy'])
    hist = model.fit(x_train, y_train, batch_size=batch_size, epochs=epoch_count,
                     verbose=verbose_level,
                     validation_data=(x_test, y_test))

    model.save(path_models + category + '_' + feature_type + '_' + model_type + '.h5')
    print('Finished training ' + model_type + '.')

    plot_model(hist, path_models, category, feature_type, model_type)
    print('Saved model charts.')

Recommended answer

You can use either a Python generator or a Keras Sequence.

The generator should yield your batches indefinitely:

def myReader(trainOrTest):
    while True:
        # Set path_features here so that it points at the next chunk of files
        # (this step is left open in the answer).

        x = np.load(path_features + 'x_' + trainOrTest + '.npy')
        y = np.load(path_features + 'y_' + trainOrTest + '.npy')

        # if you're loading them already in a shape accepted by your model:
        yield (x, y)
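
The reader above leaves the choice of file open and yields one whole file per step. A minimal sketch of one way to fill that in, assuming the features have already been split into several pairs of .npy files: the name batch_reader, the file_pairs list of (x_path, y_path) tuples, and the batch_size parameter are hypothetical and not part of the original answer. Only one pair of files is held in memory at a time, and each file is cut into mini-batches before being yielded.

```python
import numpy as np


def batch_reader(file_pairs, batch_size=32):
    """Yield (x, y) mini-batches forever, loading one file pair at a time.

    file_pairs is a hypothetical list of (x_path, y_path) tuples,
    one tuple per chunk of the split training data.
    """
    while True:  # Keras expects the generator to never run out
        for x_path, y_path in file_pairs:
            x = np.load(x_path)  # only this chunk is in memory
            y = np.load(y_path)
            # slice the chunk into mini-batches; the last one may be smaller
            for start in range(0, len(x), batch_size):
                end = start + batch_size
                yield x[start:end], y[start:end]
```

With this variant, one pass over the data takes more steps than there are files, because every file contributes several batches.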

You can then use fit_generator to train and predict_generator to predict values:

model.fit_generator(myReader(trainOrTest),steps_per_epoch=howManyFiles,epochs=.......)
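
In the call above, steps_per_epoch equals the number of files because each step yields one whole file. If the generator yields smaller mini-batches instead, steps_per_epoch should be the total number of batches across all files. The sketch below shows one way to count them without loading the data; count_steps and x_paths are hypothetical names, and it assumes plain .npy files whose first axis is the sample axis. Note also that in recent TensorFlow releases fit_generator is deprecated and model.fit accepts a generator directly.

```python
import math

import numpy as np


def count_steps(x_paths, batch_size=32):
    """Return the number of mini-batches in one pass over all files.

    x_paths is a hypothetical list of paths to the x_*.npy chunk files.
    """
    steps = 0
    for path in x_paths:
        # mmap_mode='r' maps the file rather than reading it into memory,
        # so the shape is available without loading the array contents
        n_samples = np.load(path, mmap_mode='r').shape[0]
        steps += math.ceil(n_samples / batch_size)
    return steps
```

The result can be passed as steps_per_epoch so that one epoch corresponds to one full pass over every file.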
