how to write a generator for keras model for predict_generator

Question

I have a trained keras model, and I am trying to run predictions with CPU only. I want this to be as quick as possible, so I thought I would use predict_generator with multiple workers. All of the data for my prediction tensor are loaded into memory beforehand. Just for reference, array is a list of tensors, with the first tensor having shape [nsamples, x, y, nchannels]. I made a thread-safe generator following the instructions here (I followed this when using fit_generator as well).

import numpy as np
import keras

class DataGeneratorPredict(keras.utils.Sequence):
    'Generates data for Keras'
    def __init__(self, array, batch_size=128):
        'Initialization'
        self.array = array
        self.nsamples = array[0].shape[0]
        self.batch_size = batch_size
        self.ninputs = len(array)
        self.indexes = np.arange(self.nsamples)

    def __len__(self):
        'Denotes the number of batches'
        # floor here means any final partial batch is dropped
        print('nbatches:', int(np.floor(self.nsamples / self.batch_size)))
        return int(np.floor(self.nsamples / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch
        print(index)
        inds = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        # Slice each input tensor with the batch indexes
        X = []
        for inp in range(self.ninputs):
            X.append(self.array[inp][inds])

        return X

I run predictions with my model like so,

# all_test_in is my list of input data tensors
gen = DataGeneratorPredict(all_test_in, batch_size=1024)
new_preds = conv_model.predict_generator(gen, workers=4, use_multiprocessing=True)

but I don't get any speed improvement over using conv_model.predict, regardless of the number of workers. This seemed to work well when fitting my model (i.e., getting a speed-up using a generator with multiple workers). Am I missing something in my generator? Is there a more efficient way to optimize predictions (besides using GPU)?

Answer

When you just call .predict, Keras already tries to use all available cores and predicts the data points you give it in parallel. A predict generator with multiple workers may not add any benefit in this case, because each worker either has to wait for its turn to execute or has to share the available cores. Either way, you end up with the same performance.
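
As a rough sanity check, a minimal timing sketch along these lines can compare the two paths on CPU. It reuses conv_model, all_test_in and DataGeneratorPredict from the question; everything else is illustrative only, and actual numbers will depend on your machine.

import time

# Plain predict: Keras batches the in-memory arrays itself and the
# backend already parallelizes the forward passes across CPU cores.
start = time.time()
preds_direct = conv_model.predict(all_test_in, batch_size=1024)
print('predict:           %.1f s' % (time.time() - start))

# predict_generator with workers: the workers only prepare batches;
# the forward passes still compete for the same CPU cores.
gen = DataGeneratorPredict(all_test_in, batch_size=1024)
start = time.time()
preds_gen = conv_model.predict_generator(gen, workers=4, use_multiprocessing=True)
print('predict_generator: %.1f s' % (time.time() - start))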

Use of generators is more common if your data:

  • Doesn't fit in memory. You can predict batch by batch instead of creating one big data array and calling predict (see the sketch after this list).
  • Requires on-the-fly processing that may change or be random per batch.
  • Cannot easily be stored in a NumPy array and is batched in some way other than slicing data points.
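
For the out-of-memory case, a minimal sketch of such a generator might look like the following. The file layout (one .npy file per sample under a directory) and the class name are hypothetical and not part of the original question; a single-input model is assumed for simplicity.

import os
import numpy as np
import keras

class DiskBatchGenerator(keras.utils.Sequence):
    'Loads one batch of samples from disk at a time, so the full dataset never sits in memory'
    def __init__(self, sample_dir, batch_size=128):
        # hypothetical layout: one .npy file per sample of shape (x, y, nchannels)
        self.paths = sorted(
            os.path.join(sample_dir, f)
            for f in os.listdir(sample_dir) if f.endswith('.npy')
        )
        self.batch_size = batch_size

    def __len__(self):
        # ceil so the final partial batch is still predicted
        return int(np.ceil(len(self.paths) / float(self.batch_size)))

    def __getitem__(self, index):
        batch_paths = self.paths[index*self.batch_size:(index+1)*self.batch_size]
        # load and stack just this batch's samples
        return np.stack([np.load(p) for p in batch_paths])

# usage sketch (assuming a single-input model):
# preds = conv_model.predict_generator(DiskBatchGenerator('test_samples/'),
#                                      workers=4, use_multiprocessing=True)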
