Memory Issues Using Keras Convolutional Network
Problem Description
I am very new to machine learning with big data. I have played with the generic Keras convolutional examples for dog/cat classification before, but when applying a similar approach to my set of images, I run into memory issues.
My dataset consists of very long images that are 10048 x 1687 pixels in size. To circumvent the memory issues, I am using a batch size of 1, feeding one image at a time to the model.
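For scale, a rough estimate of the per-image memory footprint shows why these images are heavy even one at a time. The channel count and data type below are assumptions (the question states neither):

```python
# Rough per-image memory footprint for a 10048 x 1687 image.
h, w = 10048, 1687
channels = 3         # assumed RGB; the question does not say
bytes_per_value = 4  # assumed float32 after preprocessing

per_image_bytes = h * w * channels * bytes_per_value
print(round(per_image_bytes / 2**20))  # → 194 (MiB per image)
```

At roughly 194 MiB per decoded image, even a handful of in-flight copies (input tensor, augmented copy, layer activations) adds up quickly against an 8 GB limit.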
The model has two convolutional layers, each followed by max-pooling, which together leave the flattened layer with roughly 290,000 inputs right before the fully-connected layer.
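The flattened size can be sanity-checked with simple shape arithmetic. The kernel size, pool size, and filter count below are hypothetical, since the question does not state them; the ~290,000 figure implies more aggressive downsampling or fewer filters than these guesses:

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    # Output length along one dimension of a 'valid' convolution.
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, pool=2):
    # Output length after max-pooling with non-overlapping windows.
    return size // pool

h, w = 10048, 1687
filters = 8  # hypothetical filter count in the last conv layer
for _ in range(2):  # two conv + max-pool stages
    h, w = conv_out(h), conv_out(w)
    h, w = pool_out(h), pool_out(w)
print(h * w * filters)  # flattened input size to the dense layer
```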
Immediately after running, however, memory usage hits its limit (8 GB).
So my question is the following:
1) What is the best way to process a computation of this size in Python locally (with no cloud usage)? Are there additional Python libraries I need to use?
Answer
Check out what yield does in Python and the idea of generators. You do not need to load all of your data at the beginning. You should make your batch_size just small enough that you do not get memory errors. Your generator can look like this:
```python
def generator(fileobj, labels, batch_size, memory_one_pic=1024):
    # Total number of samples; assumed to equal len(labels).
    amount_of_datasets = len(labels)
    start = 0
    end = start + batch_size
    while True:
        # Read only enough bytes for one batch of images.
        X_batch = fileobj.read(memory_one_pic * batch_size)
        y_batch = labels[start:end]
        start += batch_size
        end += batch_size
        if not X_batch:
            break
        if start >= amount_of_datasets:
            start = 0
            end = batch_size
        yield (X_batch, y_batch)
```
...later, when you already have your architecture ready...
```python
train_generator = generator(open('traindata.csv', 'rb'), labels, batch_size)
train_steps = amount_of_datasets // batch_size + 1
model.fit_generator(generator=train_generator,
                    steps_per_epoch=train_steps,
                    epochs=epochs)
```
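One detail worth noting: `amount_of_datasets // batch_size + 1` runs one extra step per epoch whenever the dataset size is an exact multiple of the batch size. A ceiling division gives the exact batch count:

```python
import math

def steps_per_epoch(n_samples, batch_size):
    # Number of batches needed to cover every sample exactly once.
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(100, 32))  # → 4
print(steps_per_epoch(100, 25))  # → 4, while 100 // 25 + 1 gives 5
```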
You should also read about batch_normalization, which basically helps the network learn faster and with better accuracy.
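In Keras this is the BatchNormalization layer. The core idea is standardizing each mini-batch to zero mean and unit variance (real layers additionally learn a scale gamma and shift beta). A minimal scalar sketch of just the normalization step:

```python
def batch_normalize(batch, eps=1e-5):
    # Standardize a batch of values to zero mean and unit variance.
    # eps guards against division by zero for constant batches.
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

out = batch_normalize([1.0, 2.0, 3.0, 4.0])
print(out)  # values centered around 0, roughly unit spread
```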