How to manually specify class labels in keras flow_from_directory?

Question

Problem: I am training a model for multilabel image recognition, so each of my images is associated with multiple y labels. This conflicts with the convenient Keras method "flow_from_directory" of the ImageDataGenerator, which expects each image to sit in the folder of its corresponding label (https://keras.io/preprocessing/image/).

Workaround: Currently, I read all images into a numpy array and use the "flow" function from there. But this results in a heavy memory load and a slow read-in process.

Question: Is there a way to use the "flow_from_directory" method and manually supply the (multiple) class labels?

Update: I ended up extending the DirectoryIterator class for the multilabel case. You can now set the attribute "class_mode" to the value "multilabel" and provide a dictionary "multlabel_classes" which maps filenames to their labels. Code: https://github.com/tholor/keras/commit/29ceafca3c4792cb480829c5768510e4bdb489c5
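
To make the idea of a filename-to-labels dictionary concrete, here is a minimal sketch of how such a mapping could be turned into multi-hot target vectors. It is not the code from the linked commit; the function name `build_multi_hot`, the filenames, and the label names are all hypothetical and chosen only for illustration.

```python
def build_multi_hot(labels_by_file):
    """Map each filename to a fixed-length 0/1 vector, one slot per class."""
    # Sorted class names give a stable column order across runs.
    classes = sorted({label for labels in labels_by_file.values() for label in labels})
    index = {name: i for i, name in enumerate(classes)}
    encoded = {}
    for filename, labels in labels_by_file.items():
        row = [0] * len(classes)
        for label in labels:
            row[index[label]] = 1
        encoded[filename] = row
    return classes, encoded


# Hypothetical filenames and labels, for illustration only.
labels_by_file = {
    "img_001.jpg": ["beach", "sunset"],
    "img_002.jpg": ["dog"],
    "img_003.jpg": ["beach", "dog"],
}
classes, encoded = build_multi_hot(labels_by_file)
```

Each vector has one position per class, so a multilabel model with a sigmoid output layer of the same width could consume it once the rows are converted to a numpy array.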

Recommended answer

You could write a custom generator class that reads the files from the directory and applies the labels. That custom generator could also take an ImageDataGenerator instance, which would produce the batches using flow().

I am thinking of something like this:

class Generator:
    """Pairs samples with manually supplied labels and batches them."""

    def __init__(self, X, Y, img_data_gen, batch_size):
        self.X = X
        self.Y = Y  # Maybe a file that has the appropriate label mapping?
        self.img_data_gen = img_data_gen  # The ImageDataGenerator instance
        self.batch_size = batch_size

    def apply_labels(self):
        # Code to apply labels to each sample based on self.X and self.Y
        raise NotImplementedError

    def get_next_batch(self):
        """Return an iterator that yields (images, labels) training batches."""
        # flow() returns the batch iterator, so it has to be returned here.
        return self.img_data_gen.flow(self.X, self.Y, batch_size=self.batch_size)

Then simply:

from keras.preprocessing.image import ImageDataGenerator

img_gen = ImageDataGenerator(...)
gen = Generator(X, Y, img_gen, 128)

model.fit_generator(gen.get_next_batch(), ...)

*Disclaimer: I haven't actually tested this, but it should work in theory.
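
Since the original concern was memory load, one variation on the answer is a plain Python generator that loads only the files of the current batch. The sketch below is an assumption-laden illustration, not the answerer's code: `multilabel_batches`, its parameters, and the `load_image` callable are hypothetical names, and the demo uses a stub loader that returns the file path instead of pixel data.

```python
import os
import random


def multilabel_batches(directory, labels_by_file, batch_size, load_image,
                       shuffle=True, seed=None):
    """Yield (images, labels) batches forever, loading files lazily per batch."""
    filenames = sorted(labels_by_file)
    rng = random.Random(seed)
    while True:  # training generators are expected to loop indefinitely
        if shuffle:
            rng.shuffle(filenames)
        for start in range(0, len(filenames), batch_size):
            chunk = filenames[start:start + batch_size]
            # Only the files of this batch are read, not the whole dataset.
            images = [load_image(os.path.join(directory, name)) for name in chunk]
            labels = [labels_by_file[name] for name in chunk]
            yield images, labels


# Stub loader for illustration; it returns the path instead of pixel data.
demo_labels = {"a.jpg": [1, 0], "b.jpg": [0, 1], "c.jpg": [1, 1]}
batches = multilabel_batches("data", demo_labels, batch_size=2,
                             load_image=lambda path: path, shuffle=False)
first_images, first_labels = next(batches)
```

In real use, `load_image` would need to return an image array, and each batch would have to be converted to numpy arrays before being passed to the model. The trade-off against the flow() approach is lower memory use at the cost of reading from disk during training.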
