keras predict_generator is shuffling its output when using a keras.utils.Sequence

Question

I am using keras to build a model that inputs 720x1280 images and outputs a value.

I am having a problem with keras.models.Sequential.predict_generator when using the keras.utils.Sequence class to obtain the values corresponding to images on the validation/training sets. The values returned are shuffled, so I don't know which output corresponds to which image.

This is how my generators are defined

import numpy as np
from skimage.io import ImageCollection, imread
from keras.utils import Sequence

def load_images(f):
    return imread(f).astype(np.float64)

class DataSetImageKeras(Sequence):
    def __init__(self, image_collection, values, batch_size):
        self.images = image_collection
        self.hf = values
        self.batch_size = batch_size
        self.n = len(self.images)
        self.x_scale = 250
        self.y_scale = 1e4

    def __len__(self):
        return int(np.ceil(len(self.images) / float(self.batch_size)))

    def __getitem__(self, idx):
        # batch_x is a numpy.ndarray
        batch_x = (
                self.images[idx:min(idx + self.batch_size, self.n)]
                .concatenate()
                .reshape(self.batch_size, 720, 1280, 1)
                ) 
        batch_y = self.hf[idx:min(idx + self.batch_size, self.n)]


        return batch_x/self.x_scale, batch_y/self.y_scale

images_train = ImageCollection(images_paths_train, load_func=load_images)
images_val = ImageCollection(images_paths_test, load_func=load_images)

data_train = DataSetImageKeras(images_train, values_train, n_batch)
data_val = DataSetImageKeras(images_val, values_val, n_batch)


from keras.models import load_model
model = load_model('model001') #this model is already trained

If I use the following code:

import matplotlib.pyplot as plt

val_result = []
val_hf = []
for (batch_x, batch_y) in data_val:
    val_result.append(model.predict_on_batch(batch_x))
    val_hf.append(batch_y)

val_result = np.concatenate(val_result)
val_hf = np.concatenate(val_hf)

plt.plot(val_hf, 
         val_result,
         marker='.',
         linestyle='')

The correct result is obtained (as seen on this image where x is the desired value and y is the predicted value)

However if I use the predict_generator function, as below:

val_result = model.predict_generator(data_val, verbose=1,
                                     workers=1,
                                     max_queue_size=50,
                                     use_multiprocessing=False)

The output is shuffled as can be seen here.

My problem is similar to #5048 and #6745, which should be solved by #6891 API, but I am using keras version 2.1.6 and it is still shuffling my predictions, even when using workers=1.

It is also similar to this, but I didn't find anything that could reset the generators and this problem is still present if I define a new generator and try to run the predict_generator.

I also found something stating that it could be related to the batch size not dividing the number of samples evenly, but the problem is still present if I use n_batch=1.

As a side note, it might be that predict_generator is not shuffling data, but only returning it with an index offset, since the input data on values and images_paths are already shuffled.

Recommended answer

predict_generator was not shuffling my predictions, after all. The problem was with the __getitem__ method, which used the batch index idx directly as a sample offset. For instance, using n_batch=32, the method would yield values from 1 to 32, then from 2 to 33 and so forth, instead of from 1 to 32, 33 to 64, etc.
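
To make the overlap concrete, here is a minimal sketch that only computes the slice bounds each version of __getitem__ would use. The helper names buggy_slices and fixed_slices and the sample count of 100 are made up for illustration; nothing here touches keras.

```python
import math

def buggy_slices(n, batch_size):
    """Slice bounds as in the question: the batch index is used as a sample offset."""
    n_batches = math.ceil(n / batch_size)
    return [(idx, min(idx + batch_size, n)) for idx in range(n_batches)]

def fixed_slices(n, batch_size):
    """Slice bounds after the fix: batch index times batch size gives the offset."""
    n_batches = math.ceil(n / batch_size)
    return [(idx * batch_size, min((idx + 1) * batch_size, n))
            for idx in range(n_batches)]

print(buggy_slices(100, 32))  # [(0, 32), (1, 33), (2, 34), (3, 35)]
print(fixed_slices(100, 32))  # [(0, 32), (32, 64), (64, 96), (96, 100)]
```

The buggy version returns heavily overlapping windows and never reaches most of the data, which would look like shuffled or offset predictions when compared against the full list of target values.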

Changing the method as follows solves the problem

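    def __getitem__(self, idx):
        # idx is the batch index, so convert it to a sample offset first
        idx_min = idx*self.batch_size
        idx_max = min(idx_min + self.batch_size, self.n)
        # batch_x is a numpy.ndarray; -1 (instead of self.batch_size)
        # lets the last batch be smaller than the others
        batch_x = (
                self.images[idx_min:idx_max]
                .concatenate()
                .reshape(-1, 720, 1280, 1)
                )
        batch_y = self.hf[idx_min:idx_max]

        return batch_x/self.x_scale, batch_y/self.y_scale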
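
A quick way to catch this kind of indexing mistake is to check that concatenating all batches reproduces the original data in order. The sketch below uses a hypothetical ToySequence class holding a plain list as a stand-in for DataSetImageKeras, so it runs without keras or any images; covers_in_order is likewise an illustrative helper, not part of any library.

```python
import math

class ToySequence:
    """Hypothetical stand-in for DataSetImageKeras that holds a plain list."""

    def __init__(self, values, batch_size):
        self.values = list(values)
        self.batch_size = batch_size
        self.n = len(self.values)

    def __len__(self):
        # Number of batches, counting a final partial batch
        return math.ceil(self.n / self.batch_size)

    def __getitem__(self, idx):
        # Same indexing as the corrected method above
        idx_min = idx * self.batch_size
        idx_max = min(idx_min + self.batch_size, self.n)
        return self.values[idx_min:idx_max]

def covers_in_order(seq, values):
    """Return True if concatenating all batches reproduces values exactly."""
    flat = [v for i in range(len(seq)) for v in seq[i]]
    return flat == list(values)

values = list(range(100))
print(covers_in_order(ToySequence(values, 32), values))
```

The same check could be applied to the target values of the real generator before calling predict_generator, since every batch index from 0 to len(seq) - 1 should be visited exactly once.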
