Keras iterator with augmented images and other features


Question

Say you have a dataset of images, plus some data in a .csv for each image. Your goal is to create a NN that has a convolution branch and another branch (in my case an MLP).

There are plenty of guides (one here, another one) on how to create the network; that is not the problem.

The issue is how to create an iterator of the form [[convolution_input, other_features], target] when convolution_input comes from a Keras ImageDataGenerator flow that adds augmented images.

More specifically, when the nth image (which may or may not be an augmented one) is fed to the NN, I want its original features to be in other_features.

I found a few attempts at doing exactly that (here and here; the second one looked promising, but I wasn't able to figure out how to handle augmented images), but they do not seem to take into account the possible dataset manipulation that the Keras generator does.

Recommended answer

Let's say you have a csv in which both your images and the other features are listed.

Here id is the image name, followed by the features and then your target (a class for classification, a number for regression).

|         id          | feat1 | feat2 | feat3 | class |
|---------------------|-------|-------|-------|-------|
| 1_face_IMG_NAME.jpg |   1   |   0   |   1   |   A   |
| 3_face_IMG_NAME.jpg |   1   |   0   |   1   |   B   |
| 2_face_IMG_NAME.jpg |   1   |   0   |   1   |   A   |
|         ...         |  ...  |  ...  |  ...  |  ...  |

First let us define a data generator; later we can override it.

Let us read the csv into a pandas dataframe and use Keras's flow_from_dataframe to read from the dataframe.

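import pandas
from tensorflow.keras.preprocessing.image import ImageDataGenerator

df = pandas.read_csv("dummycsv.csv")
datagen = ImageDataGenerator(rescale=1/255.)
generator = datagen.flow_from_dataframe(
    df,
    directory="out/",
    x_col="id",
    # every column after "id": the features followed by the target
    y_col=list(df.columns[1:]),
    class_mode="raw",
    batch_size=1)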

You can always add your augmentation options in ImageDataGenerator.

Things to note about the flow_from_dataframe call above:

x_col = the column holding the image names.

y_col = typically the column with the class name, but here we first pass all the other columns in the csv, i.e. feat1, feat2, ... up to class, and split them later.

class_mode = raw tells the generator to return all the values in y as is.
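As a rough illustration of what this implies for each batch, the sketch below splits a y array of the shape described above into features and target with plain NumPy slicing. The batch values are made up, and it assumes the class column has already been encoded as a number, since raw mode passes values through unchanged.

```python
import numpy as np

# Hypothetical batch of 3 rows, laid out the way the y columns were passed:
# feat1, feat2, feat3, target (target assumed to be numeric already).
y_batch = np.array([
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
], dtype="float32")

other_features = y_batch[:, :-1]  # everything except the last column
target = y_batch[:, -1]           # the last column only

print(other_features.shape)  # (3, 3)
print(target.shape)          # (3,)
```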

Now let us wrap the above generator in a new one that returns [img, otherfeatures], [target].

Here is the code, with comments as explanation:

import numpy as np

def my_custom_generator():
    # Keras expects a training generator to loop forever; the underlying
    # iterator starts a new pass over df by itself, so no counter is needed
    while True:
        # get the next batch from the generator
        data = next(generator)

        # data looks like (imgs, other_cols), one row per image in the batch
        imgs = []
        cols = []
        targets = []

        # iterate the batch and append each part to the corresponding list;
        # len(data[0]) also handles a smaller final batch
        for k in range(len(data[0])):
            # the first array contains all images
            imgs.append(data[0][k])

            # the second array contains all features with last column as class, so [:-1]
            cols.append(data[1][k][:-1])

            # the last column in the second array from data is the class
            # (it must be numeric for training)
            targets.append(data[1][k][-1])

        # yield arrays in the form [img, otherfeatures], target
        yield [np.array(imgs), np.array(cols, dtype="float32")], np.array(targets)
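The question's concern was whether an augmented image still arrives with its own row of features. The toy check below is a NumPy-only stand-in, not Keras code: it assumes the iterator shuffles by index, augments the image, and returns the y rows for those same indices. All names and data in it are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: image i is a 4x4x1 array whose left half is filled with i,
# and its feature row is [i].
n = 6
images = np.zeros((n, 4, 4, 1), dtype="float32")
for i in range(n):
    images[i, :, :2, :] = i
features = np.arange(n, dtype="float32").reshape(n, 1)

def toy_flow(batch_size):
    """Stand-in for a Keras iterator: shuffle, augment, yield (x, y)."""
    while True:
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            x = images[idx].copy()
            # "augmentation": flip some of the images left-right
            flip = rng.random(len(idx)) < 0.5
            x[flip] = x[flip][:, :, ::-1, :]
            yield x, features[idx]

flow = toy_flow(batch_size=2)
for _ in range(9):  # three passes over the toy data
    x, y = next(flow)
    # the largest pixel value still identifies the source row
    assert np.array_equal(x.max(axis=(1, 2, 3)), y[:, 0])
```

The same kind of check can be run on a real dataset by putting a recognisable value into one feature column and comparing it with the image that comes back.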

Create a similar function for your validation generator. If you need to, use train_test_split to split your dataframe, create two generators, and wrap each of them in the same way.
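One way to avoid writing the function twice is to pass the underlying iterator in as an argument. The sketch below is a minimal version of that idea; multi_input_batches and fake_iterator are hypothetical names, and the fake iterator only stands in for what flow_from_dataframe would return for the training and validation dataframes.

```python
import numpy as np

def multi_input_batches(base_iterator):
    """Wrap any iterator of (images, raw_y) batches.

    Yields [images, other_features], target, where the target is
    assumed to be the last column of raw_y.
    """
    for imgs, raw_y in base_iterator:
        yield [imgs, raw_y[:, :-1]], raw_y[:, -1]

# Stand-in for an endless Keras iterator, made up for this sketch.
def fake_iterator(batch_size):
    while True:
        imgs = np.zeros((batch_size, 8, 8, 3), dtype="float32")
        raw_y = np.ones((batch_size, 4), dtype="float32")
        yield imgs, raw_y

train_batches = multi_input_batches(fake_iterator(batch_size=4))
val_batches = multi_input_batches(fake_iterator(batch_size=4))

inputs, target = next(train_batches)
print(inputs[0].shape, inputs[1].shape, target.shape)
# (4, 8, 8, 3) (4, 3) (4,)
```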

Pass the function to model.fit_generator like this:

model.fit_generator(my_custom_generator(),.....other params)
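Among the other params, steps_per_epoch matters most when the generator loops forever, because it tells Keras how many batches make up one epoch. A common choice, sketched here with made-up numbers, is the number of rows divided by the batch size, rounded up.

```python
import math

n_rows = 1000     # hypothetical number of rows in the training dataframe
batch_size = 32   # hypothetical batch size given to flow_from_dataframe

# one epoch = enough batches to see every row once
steps_per_epoch = math.ceil(n_rows / batch_size)
print(steps_per_epoch)  # 32
```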
