Memory Issues Using Keras Convolutional Network


Problem Description

I am very new to ML with big data. I have played with Keras's generic convolutional examples for dog/cat classification before, but when I apply a similar approach to my set of images, I run into memory issues.

My dataset consists of very long images, 10048 x 1687 pixels in size. To circumvent the memory issues, I am using a batch size of 1, feeding one image at a time to the model.

The model has two convolutional layers, each followed by max pooling, which together leave the flattened layer with roughly 290,000 inputs right before the fully connected layer.
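For reference, a minimal Keras sketch of an architecture like the one described might look like the following. The filter counts, kernel sizes, and pool sizes are illustrative assumptions (the question does not give them), as is the single-channel input:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # hypothetical filter counts and pool sizes; only the overall
    # structure (two conv layers, each followed by max pooling) is
    # taken from the question
    Conv2D(16, (3, 3), activation='relu', input_shape=(10048, 1687, 1)),
    MaxPooling2D(pool_size=(4, 4)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(4, 4)),
    Flatten(),                       # produces the very large input vector
    Dense(1, activation='sigmoid'),  # binary classification head
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

With inputs this large, the flattened vector (and hence the first fully connected layer's weight matrix) dominates memory usage.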

Immediately after running, however, memory usage chokes at its limit (8 GB).

So my questions are as follows:

1) What is the best way to process computations of this size locally in Python (with no cloud utilization)? Are there additional Python libraries I need to use?

Recommended Answer

Check out what yield does in Python and the idea of generators. You do not need to load all of your data at the start; you should make your batch_size just small enough that you do not get memory errors. Your generator can look like this:

def generator(fileobj, labels, batch_size, memory_one_pic=1024):
    # memory_one_pic: number of bytes one picture occupies in the file.
    # Note: batch_size must come before the parameter with a default value,
    # otherwise the function definition is a syntax error.
    amount_of_datasets = len(labels)
    start = 0
    end = start + batch_size
    while True:
        # read just enough bytes from disk for one batch
        X_batch = fileobj.read(memory_one_pic * batch_size)
        y_batch = labels[start:end]
        start += batch_size
        end += batch_size
        if not X_batch:
            break
        if start >= amount_of_datasets:
            # wrap around so the generator can serve multiple epochs
            start = 0
            end = batch_size
            fileobj.seek(0)  # rewind the file along with the labels
        yield (X_batch, y_batch)

...later, when you already have your architecture ready:

train_generator = generator(open('traindata.csv', 'rb'), labels, batch_size)
# amount_of_datasets is the total number of training samples, e.g. len(labels)
train_steps = amount_of_datasets // batch_size + 1

model.fit_generator(generator=train_generator,
                    steps_per_epoch=train_steps,
                    epochs=epochs)

You should also read about batch normalization, which basically helps the network learn faster and reach better accuracy.
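A minimal sketch of how a BatchNormalization layer might be inserted after a convolution in Keras; the input shape and filter count here are illustrative assumptions, not part of the answer:

from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, Activation

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(256, 256, 1)))  # hypothetical shape
model.add(BatchNormalization())  # normalizes activations across each batch
model.add(Activation('relu'))    # nonlinearity applied after normalization

Placing the normalization between the convolution and its activation is a common choice; it keeps the distribution of layer inputs stable during training.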
