Multi-Label Image Classification
Problem description
I tried to solve this myself but could not get it working, so I am posting here. Please guide me.
I am working on multi-label image classification, but my scenario is slightly different. I am confused about how to map the labels and their attributes to each image Id so that they can be used for training and testing.

Here is the code I am working on:
import os
import numpy as np
import pandas as pd
from keras.utils import to_categorical
from collections import Counter
from keras.callbacks import Callback
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from matplotlib import pyplot as plt
from tensorflow.keras import backend
def create_tag_mapping(mapping_csv):
    labels = set()
    for i in range(len(mapping_csv)):
        tags = mapping_csv['Labels'][i].split(' ')
        labels.update(tags)
    labels = list(labels)
    labels.sort()
    labels_map = {labels[i]: i for i in range(len(labels))}
    inv_labels_map = {i: labels[i] for i in range(len(labels))}
    return labels_map, inv_labels_map
# create a mapping of filename to tags
def create_file_mapping(mapping_csv):
    mapping = dict()
    for i in range(len(mapping_csv)):
        name, tags = mapping_csv['Id'][i], mapping_csv['Labels'][i]
        mapping[name] = tags.split(' ')
    return mapping
# create a one hot encoding for one list of tags
def one_hot_encode(tags, mapping):
    # create empty vector
    encoding = np.zeros(len(mapping), dtype='uint8')
    # mark 1 for each tag in the vector
    for tag in tags:
        encoding[mapping[tag]] = 1
    return encoding
def load_dataset(path, file_mapping, tag_mapping):
    photos, targets = list(), list()
    # enumerate files in the directory
    for filename in os.listdir(path):
        # load image
        photo = load_img(path + filename, target_size=(760, 415))
        # convert to numpy array
        photo = img_to_array(photo, dtype='uint8')
        # get tags (strip the 4-character file extension, e.g. ".png")
        tags = file_mapping[filename[:-4]]
        # one hot encode tags
        target = one_hot_encode(tags, tag_mapping)
        # store
        photos.append(photo)
        targets.append(target)
    X = np.asarray(photos, dtype='uint8')
    y = np.asarray(targets, dtype='uint8')
    return X, y
trainingLabels = 'labels.csv'
# load the mapping file
mapping_csv = pd.read_csv(trainingLabels)
# create a mapping of tags to integers
tag_mapping, _ = create_tag_mapping(mapping_csv)
# create a mapping of filenames to tag lists
file_mapping = create_file_mapping(mapping_csv)
# load the png images
folder = 'dataset/'
X, y = load_dataset(folder, file_mapping, tag_mapping)
print(X.shape, y.shape)
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.3, random_state=1)
print(trainX.shape, trainY.shape, testX.shape, testY.shape)
img_x,img_y=760,415
trainX=trainX.reshape(trainX.shape[0], img_x,img_y,3)
testX=testX.reshape(testX.shape[0], img_x,img_y,3)
trainX=trainX.astype('float32')
testX=testX.astype('float32')
trainX /= 255
testX /=255
trainY=to_categorical(trainY,3)
testY=to_categorical(testY,3)
print(trainX.shape)
print(trainY.shape)
model = Sequential()
model.add(Conv2D(32, (5, 5), strides=(1,1), activation='relu', input_shape=(img_x, img_y,3)))
model.add(MaxPooling2D((2, 2), strides=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history=model.fit(trainX, trainY, batch_size=2, epochs=5, verbose=1)
plt.plot(history.history['accuracy'])  # the key is 'acc' in older Keras versions
plt.plot(history.history['loss'])
plt.title('Accuracy and loss')
plt.xlabel('epoch')
plt.ylabel('accuracy/loss')
plt.legend(['Accuracy','loss'],loc='upper left')
plt.show()
score=model.evaluate(testX,testY,verbose=0)
print('test loss',score[0])
print('test accuracy',score[1])
I have attached an image file that gives a clear picture of my problem.
For example, consider tutorials such as these:
- https://towardsdatascience.com/journey-to-the-center-of-multi-label-classification-384c40229bff
- https://www.analyticsvidhya.com/blog/2019/04/predicting-movie-genres-nlp-multi-label-classification/
These tutorials have multiple labels for each image, but in my case I have multiple labels plus an attribute for each label.
Recommended answer
Based on the discussion above, here is a solution to the problem. As mentioned, there are 5 labels in total, and each label takes one of three tags (L, M, H). The tags can be encoded like this:
# create a one hot encoding for one list of tags
def custom_encode(tags, mapping):
    # create empty vector
    encoding = []
    for tag in tags:
        if tag == 'L':
            encoding.append([1, 0, 0])
        elif tag == 'M':
            encoding.append([0, 1, 0])
        else:
            encoding.append([0, 0, 1])
    return encoding
So the encoded y-vector will look like this:
Labels       Tags            Encoded Tags
Label1 ----> [L,L,L,M,H] ---> [ [1,0,0], [1,0,0], [1,0,0], [0,1,0], [0,0,1] ]
Label2 ----> [L,H,L,M,H] ---> [ [1,0,0], [0,0,1], [1,0,0], [0,1,0], [0,0,1] ]
Label3 ----> [L,M,L,M,H] ---> [ [1,0,0], [0,1,0], [1,0,0], [0,1,0], [0,0,1] ]
Label4 ----> [M,M,L,M,H] ---> [ [0,1,0], [0,1,0], [1,0,0], [0,1,0], [0,0,1] ]
Label5 ----> [M,L,L,M,H] ---> [ [0,1,0], [1,0,0], [1,0,0], [0,1,0], [0,0,1] ]
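
The table above can be reproduced with a small lookup table. The following is a minimal sketch in plain Python, not the answerer's exact code: `TAG_TO_ONE_HOT`, `INDEX_TO_TAG`, `encode_tags`, and `decode_tags` are hypothetical names, and unlike `custom_encode` it raises a `KeyError` for an unknown tag instead of silently treating it as H.

```python
# Hypothetical lookup table: one one-hot vector per tag (L, M, H).
TAG_TO_ONE_HOT = {
    "L": [1, 0, 0],
    "M": [0, 1, 0],
    "H": [0, 0, 1],
}
# Reverse lookup: position of the 1 in a row -> tag.
INDEX_TO_TAG = ["L", "M", "H"]


def encode_tags(tags):
    """Turn a list of 5 tags into a 5 x 3 nested list of one-hot rows."""
    return [list(TAG_TO_ONE_HOT[tag]) for tag in tags]


def decode_tags(encoded):
    """Invert encode_tags by locating the 1 in each row."""
    return [INDEX_TO_TAG[row.index(1)] for row in encoded]


# The first two rows of the table above.
rows = [
    ["L", "L", "L", "M", "H"],
    ["L", "H", "L", "M", "H"],
]
y = [encode_tags(tags) for tags in rows]

print(y[0])                             # one-hot rows for the first example
print(len(y), len(y[0]), len(y[0][0]))  # examples, labels, tags per label
```

Passing `y` to `np.asarray` should give an array of shape (number of rows, 5, 3), which is the per-example shape (5, 3) that the reshaped final layer produces.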
The final layers then look like this:
from tensorflow.keras.layers import Activation, Reshape

model.add(Dense(15))  # 5 labels x 3 tags each = 15 neurons in the final layer
model.add(Reshape((5, 3)))  # one row of 3 tag scores for each of the 5 labels
model.add(Activation('softmax'))  # softmax over the last axis, i.e. over the 3 tags of each label
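
To see what the `Reshape` and `softmax` steps do to the 15 raw outputs, here is a framework-free sketch in plain Python. It assumes the softmax is applied along the last axis (the default for `Activation('softmax')` in Keras), so each label gets its own probability distribution over L, M, and H. The logits are made-up numbers, and `softmax`, `reshape_5x3`, and `decode` are hypothetical helper names.

```python
import math

TAGS = ["L", "M", "H"]


def softmax(scores):
    """Numerically stable softmax over one list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reshape_5x3(flat):
    """Split 15 values into 5 rows of 3, mirroring Reshape((5, 3))."""
    return [flat[i:i + 3] for i in range(0, 15, 3)]


def decode(probabilities):
    """Pick the most probable tag for each of the 5 labels."""
    return [TAGS[row.index(max(row))] for row in probabilities]


# Made-up raw outputs of Dense(15) for a single image.
logits = [
    2.0, 0.1, 0.1,
    0.3, 0.2, 1.9,
    1.5, 0.0, 0.2,
    0.1, 2.2, 0.4,
    0.0, 0.5, 3.0,
]

# Softmax is applied to each row separately, not to all 15 values at once.
probabilities = [softmax(row) for row in reshape_5x3(logits)]
predicted = decode(probabilities)
print(predicted)  # ['L', 'H', 'L', 'M', 'H']
```

Because each row sums to 1, the output has the same form as the one-hot targets of shape (5, 3), which is what a categorical cross-entropy loss compares row by row.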