Memory Issues Using Keras Convolutional Network


Problem Description

I am very new to machine learning with big data. I have previously played with the generic Keras convolutional examples for dog/cat classification, but when applying a similar approach to my own set of images, I run into memory issues.

My dataset consists of very long images that are 10048 x 1687 pixels in size. To circumvent the memory issues, I am using a batch size of 1, feeding one image at a time to the model.
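
For reference, a minimal sketch of feeding one image per step with Keras' built-in ImageDataGenerator (the 'data/train' folder layout, rescaling, and binary class mode below are assumptions, not stated in the question):

from keras.preprocessing.image import ImageDataGenerator

# Minimal sketch: stream images from disk one at a time instead of
# loading the whole dataset into memory. 'data/train' and target_size
# are placeholder assumptions.
datagen = ImageDataGenerator(rescale=1. / 255)
train_flow = datagen.flow_from_directory('data/train',
                                         target_size=(1687, 10048),
                                         batch_size=1,
                                         class_mode='binary')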

The model has two convolutional layers, each followed by max pooling, which together leave the flattened layer with roughly 290,000 inputs right before the fully connected layer.
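
A minimal sketch of such an architecture (the filter counts, kernel sizes, and dense layer widths are assumptions; only the two conv + max-pooling blocks and the large flattened layer come from the question itself):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Hypothetical reconstruction of the architecture described above.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(1687, 10048, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),  # produces a very large input vector for the dense layer
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid'),  # binary output as in the dog/cat example
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])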

Immediately after running, however, memory usage chokes at its limit (8 GB).

So my questions are as follows:

1) What is the best way to handle computations of this size in Python locally (with no cloud usage)? Are there additional Python libraries I need to use?

Recommended Answer

Check out what yield does in Python and the idea of generators. You do not need to load all of your data at the beginning. You should make your batch_size just small enough that you do not get memory errors. Your generator can look like this:

def generator(fileobj, labels, batch_size, memory_one_pic=1024):
    amount_of_datasets = len(labels)  # total number of samples
    start = 0
    end = start + batch_size
    while True:
        # read just enough bytes from the file for one batch of pictures
        X_batch = fileobj.read(memory_one_pic * batch_size)
        y_batch = labels[start:end]
        start += batch_size
        end += batch_size
        if not X_batch:
            break
        if start >= amount_of_datasets:
            # wrap around so the generator can serve multiple epochs
            start = 0
            end = batch_size
        yield (X_batch, y_batch)

...later, when you already have your architecture ready...

amount_of_datasets = len(labels)  # must match the count used inside the generator
train_generator = generator(open('traindata.csv', 'rb'), labels, batch_size)
train_steps = amount_of_datasets // batch_size + 1

model.fit_generator(generator=train_generator,
                    steps_per_epoch=train_steps,
                    epochs=epochs)
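
Note that in newer versions of Keras (the tf.keras bundled with TensorFlow 2.x), fit_generator is deprecated and model.fit accepts the generator directly with the same steps_per_epoch argument.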

You should also read about batch normalization, which basically helps the network learn faster and with better accuracy.
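
A minimal sketch of where BatchNormalization can go, assuming the common placement between a convolution and its activation (the exact placement is not spelled out in the original answer):

from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, Activation

# Assumed placement: normalize the conv output before the nonlinearity.
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(1687, 10048, 3)))
model.add(BatchNormalization())  # re-centers and re-scales activations per batch
model.add(Activation('relu'))

Keep in mind that with a batch size of 1 the per-batch statistics are degenerate, so batch normalization pays off once the generator approach allows batches larger than one.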
