Keras: load images batch-wise for large dataset
Question
Is it possible in Keras to load only one batch in memory at a time, as I have a 40 GB dataset of images?
If the dataset were small I could use ImageDataGenerator to generate batches, but because the dataset is large I can't load all the images into memory.
Is there any method in Keras to do something similar to the following TensorFlow code:
path_queue = tf.train.string_input_producer(input_paths, shuffle=False)
paths, contents = reader.read(path_queue)
inputs = decode(contents)
input_batch = tf.train.batch([inputs], batch_size=2)
I am using this method to serialize inputs in TensorFlow, but I don't know how to achieve the same task in Keras.
Answer
Keras models have the method fit_generator(). It accepts a Python generator or a Keras Sequence as input.
You can create a simple generator like this:
fileList = listOfFiles

def imageLoader(files, batch_size):

    L = len(files)

    # this outer loop just makes the generator infinite, Keras needs that
    while True:

        batch_start = 0
        batch_end = batch_size

        while batch_start < L:
            limit = min(batch_end, L)
            X = someMethodToLoadImages(files[batch_start:limit])
            Y = someMethodToLoadTargets(files[batch_start:limit])

            yield (X, Y)  # a tuple with two numpy arrays, each with batch_size samples

            batch_start += batch_size
            batch_end += batch_size
Fit the model like this:

model.fit_generator(imageLoader(fileList, batch_size), steps_per_epoch=..., epochs=..., ...)
Normally, you pass to steps_per_epoch the number of batches you will take from the generator.
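For example, steps_per_epoch is just the number of batches per pass over the file list, which you can compute with math.ceil (a sketch; file_list and batch_size below are hypothetical stand-ins for your own values):

```python
import math

# Hypothetical stand-ins for your own file list and batch size.
file_list = ["img_%d.png" % i for i in range(10)]
batch_size = 3

# steps_per_epoch = number of batches the generator yields per epoch;
# ceil so the final partial batch is counted too.
steps_per_epoch = math.ceil(len(file_list) / batch_size)
print(steps_per_epoch)  # 10 files / 3 per batch -> 4 steps
```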
You can also implement your own Keras Sequence. It's a little more work, but the Keras docs recommend it if you're going to use multi-threaded processing.
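A Sequence only needs __len__ (number of batches) and __getitem__ (one batch by index). A minimal sketch of that shape, with hypothetical loader functions standing in for real image decoding (in real code the class would subclass keras.utils.Sequence, which is omitted here so the batching logic stays self-contained):

```python
import math

# Hypothetical dummy loaders; real code would decode images / look up labels.
def load_image(path):
    return path          # stand-in for a decoded image array

def load_target(path):
    return len(path)     # stand-in for the label of this file

class ImageSequence:  # in real code: class ImageSequence(keras.utils.Sequence)
    """Serves a list of file paths batch by batch; Keras calls __len__ and __getitem__."""

    def __init__(self, files, batch_size):
        self.files = files
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch (plays the role of steps_per_epoch).
        return math.ceil(len(self.files) / self.batch_size)

    def __getitem__(self, idx):
        # Slice out batch number idx; the last batch may be shorter.
        batch = self.files[idx * self.batch_size:(idx + 1) * self.batch_size]
        X = [load_image(f) for f in batch]
        Y = [load_target(f) for f in batch]
        return X, Y

seq = ImageSequence(["a.png", "b.png", "c.png"], batch_size=2)
print(len(seq))  # 2 batches: one of 2 files, one of 1
```

Because each batch is addressed by index rather than produced in order, Keras can safely pull batches from multiple workers, which is why the Sequence route is preferred for multiprocessing.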