Use different target vectors for CNN
Question
I want to train my CNN with custom target vectors instead of the standard one-hot encoding. My image data is in 10 different folders (one per category). How can I use my own target vectors? flow_from_directory() outputs a one-hot encoded array of labels. I have the label vectors stored in a dictionary, and the folder names are the labels, if that helps.
Recommended answer
As you may know, ImageDataGenerator in Keras yields its batches like a Python generator (if you are not familiar with Python generators, it is worth reading up on them first). Since you want to use customized target vectors (and not the ones generated by flow_from_directory()), you can change the behavior of the image generator by wrapping it inside another function. Here is how:
First, we need to store our custom targets as a NumPy array:
import numpy as np

# a NumPy array containing the custom target vector of each class
# custom_targets[0] is the target vector of class #1 images
# custom_targets[1] is the target vector of class #2 images
# etc.
custom_targets = np.asarray(your_custom_targets)  # placeholder: supply your own vectors
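Since the question mentions label vectors stored in a dictionary keyed by folder name, one way to build that array is to stack the vectors in sorted folder-name order. This is only a minimal sketch: the dictionary `label_vectors`, its class names and its vectors are made up for illustration, and it assumes that Python's `sorted()` order matches the order the generator assigns to the directories.

```python
import numpy as np

# Hypothetical dictionary: folder (class) name -> custom target vector.
label_vectors = {
    "dog": [0.1, 0.9, 0.0],
    "cat": [0.7, 0.2, 0.1],
    "bird": [0.0, 0.3, 0.7],
}

# Sort the names so that row i belongs to the i-th class in alphabetical order.
# Assumption: this matches the order used by the image generator.
class_names = sorted(label_vectors)
custom_targets = np.array(
    [label_vectors[name] for name in class_names], dtype="float32"
)

print(class_names)           # ['bird', 'cat', 'dog']
print(custom_targets.shape)  # (3, 3)
```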
Second, we create an image generator as usual and use flow_from_directory to read images from disk. Set the class_mode argument to 'sparse' to obtain the class index of each image. You can also set the classes argument to a list containing the class names (i.e. the directory names). If you don't set this argument, the class names are mapped to label indices in alphanumeric order (i.e. 0 for the class whose name comes first in alphabetical order, and so on):
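from keras.preprocessing.image import ImageDataGenerator  # or tensorflow.keras.preprocessing.image

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(150, 150),
    batch_size=32,
    class_mode='sparse')  # NOTE: set class_mode to sparse to get class indices instead of one-hot vectors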
(Note: if you don't set the classes argument, make sure that custom_targets[i] corresponds to the i-th class in alphabetical order, counting from 0.)
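Rather than relying on the ordering by hand, the rows can also be placed using the generator's `class_indices` attribute, which maps each class name to its index. The sketch below is an illustration only: `class_indices` is a hand-written stand-in for `train_generator.class_indices`, and `label_vectors` and the helper `targets_from_class_indices` are hypothetical names.

```python
import numpy as np

def targets_from_class_indices(class_indices, label_vectors):
    """Return an array whose row i is the target vector of the class with index i."""
    num_classes = len(class_indices)
    dim = len(next(iter(label_vectors.values())))
    targets = np.zeros((num_classes, dim), dtype="float32")
    for name, index in class_indices.items():
        targets[index] = label_vectors[name]
    return targets

# Stand-in for train_generator.class_indices (hypothetical class names).
class_indices = {"bird": 0, "cat": 1, "dog": 2}

# Hypothetical dictionary: folder (class) name -> custom target vector.
label_vectors = {
    "dog": [0.1, 0.9, 0.0],
    "cat": [0.7, 0.2, 0.1],
    "bird": [0.0, 0.3, 0.7],
}

custom_targets = targets_from_class_indices(class_indices, label_vectors)
```

Built this way, the array stays aligned with the label indices even when the classes argument changes the order.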
Now we can wrap our generator inside another function that takes each batch of images with its numeric labels and uses those labels to look up our own target vectors:
def custom_generator(generator):
    for data, labels in generator:
        # get the custom target vector of each sample's class; the sparse labels
        # may arrive as floats, so cast them to int before indexing
        custom_labels = custom_targets[labels.astype(int)]
        yield data, custom_labels
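To see what the wrapper does without loading any images, the following sketch feeds it a stand-in generator. `fake_generator`, the batch shapes and the target vectors are made up for illustration, and the cast to `int` rests on the assumption that the sparse labels may arrive as floats.

```python
import numpy as np

# Hypothetical target vectors: row i belongs to the class with index i.
custom_targets = np.array([[0.0, 0.3, 0.7],
                           [0.7, 0.2, 0.1],
                           [0.1, 0.9, 0.0]], dtype="float32")

def custom_generator(generator):
    for data, labels in generator:
        # Integer-array indexing picks one target vector per sample.
        yield data, custom_targets[np.asarray(labels).astype(int)]

def fake_generator():
    """Stand-in for a Keras generator: yields (images, sparse labels) forever."""
    while True:
        images = np.zeros((4, 150, 150, 3), dtype="float32")
        labels = np.array([2.0, 0.0, 1.0, 2.0], dtype="float32")
        yield images, labels

data, targets = next(custom_generator(fake_generator()))
print(data.shape, targets.shape)  # (4, 150, 150, 3) (4, 3)
```

Each batch keeps its images unchanged, while the label array grows from one index per sample to one target vector per sample.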
And that's it! Now we have a custom generator that we can pass to fit_generator (or to predict_generator or evaluate_generator at inference time) like any other generator:
model.fit_generator(custom_generator(train_generator),
                    steps_per_epoch=len(train_generator))  # plus the rest of the args