Is it possible to automatically infer the class_weight from flow_from_directory in Keras?
Question
I have an imbalanced multi-class dataset, and I want to use the class_weight argument of fit_generator to weight the classes according to the number of images in each class. I'm using ImageDataGenerator.flow_from_directory to load the dataset from a directory.
Is it possible to infer the class_weight argument directly from the ImageDataGenerator object?
Recommended answer
I just figured out a way to achieve this.
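from collections import Counter
from keras.preprocessing.image import ImageDataGenerator  # import path assumes standalone Keras

train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(...)

# Count how many images belong to each class index
counter = Counter(train_generator.classes)
max_val = float(max(counter.values()))
# Weight each class by (largest class count) / (its own count)
class_weights = {class_id: max_val / num_images for class_id, num_images in counter.items()}

model.fit_generator(...,
                    class_weight=class_weights)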
train_generator.classes holds the class index of each image. Counter(train_generator.classes) creates a counter of the number of images in each class.
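To see what this weighting produces without loading any images, here is a minimal sketch that applies the same formula to a small hand-written label list. The names max_ratio_class_weights and example_labels are hypothetical stand-ins introduced for illustration; in real use the labels would come from train_generator.classes.

```python
from collections import Counter


def max_ratio_class_weights(labels):
    """Weight each class by max_count / class_count, so the largest class gets 1.0."""
    counter = Counter(labels)
    max_val = float(max(counter.values()))
    return {class_id: max_val / num_images
            for class_id, num_images in counter.items()}


# Hypothetical labels standing in for train_generator.classes:
# four images of class 0, two of class 1, one of class 2.
example_labels = [0, 0, 0, 0, 1, 1, 2]
example_weights = max_ratio_class_weights(example_labels)
print(example_weights)  # {0: 1.0, 1: 2.0, 2: 4.0}
```

The rarest class ends up with the largest weight, so its errors count more in the loss.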
Note that these weights may not be good for convergence, but you can use them as a starting point for other kinds of frequency-based weighting.
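As one example of an alternative frequency-based scheme, the sketch below computes "balanced" weights as n_samples / (n_classes * class_count), the same heuristic scikit-learn uses for its "balanced" class weights. This is not part of the original answer; balanced_class_weights and example_labels are hypothetical names, and whether this scheme converges better than the max-ratio weights depends on the dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counter = Counter(labels)
    n_samples = sum(counter.values())
    n_classes = len(counter)
    return {class_id: n_samples / (n_classes * count)
            for class_id, count in counter.items()}


# Hypothetical labels standing in for train_generator.classes:
# four images of class 0 and two of class 1.
example_labels = [0, 0, 0, 0, 1, 1]
balanced_weights = balanced_class_weights(example_labels)
print(balanced_weights)  # {0: 0.75, 1: 1.5}
```

The resulting dictionary has the same shape as class_weights above, so it could be passed to the class_weight argument in the same way.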
This answer was inspired by: https://github.com/fchollet/keras/issues/1875#issuecomment-273752868