TensorFlow/Keras-预期global_average_pooling2d_1_input的形状为(1,1,2048),但数组的形状为(7,7,2048) [英] TensorFlow/Keras - expected global_average_pooling2d_1_input to have shape (1, 1, 2048) but got array with shape (7, 7, 2048)

查看:102
本文介绍了TensorFlow/Keras-预期global_average_pooling2d_1_input的形状为(1,1,2048),但数组的形状为(7,7,2048)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对TensorFlow和图像分类还很陌生,所以我可能缺少关键知识,这可能就是我面临此问题的原因.

我已经在TensorFlow中构建了一个ResNet50模型,目的是使用ImageNet库对狗品种进行图像分类,并且我已经成功地训练了可以检测各种狗品种的神经网络.

我现在正想将一只狗的随机图像传递给我的模型,以便它吐出关于它认为狗的品种的输出.但是,当我运行此函数dog_breed_predictor("<file path to image>")时,当它尝试执行行Resnet50_model.predict(bottleneck_feature)时出现错误expected global_average_pooling2d_1_input to have shape (1, 1, 2048) but got array with shape (7, 7, 2048),我不知道如何解决此问题.

这是代码.我提供了我认为与问题有关的所有信息.

import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm

from sklearn.datasets import load_files
np_utils = tf.keras.utils

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/dogImages/valid')
test_files, test_targets = load_dataset('dogImages/dogImages/test')

#define Resnet50 model
Resnet50_model = ResNet50(weights="imagenet")

def path_to_tensor(img_path):
    #loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    #convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    #convert 3D tensor into 4D tensor with shape (1, 224, 224, 3)
    return np.expand_dims(x, axis=0)

from keras.applications.resnet50 import preprocess_input, decode_predictions

def ResNet50_predict_labels(img_path):
    #returns prediction vector for image located at img_path
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(Resnet50_model.predict(img))

###returns True if a dog is detected in the image stored at img_path
def dog_detector(img_path):
    prediction = ResNet50_predict_labels(img_path)
    return ((prediction <= 268) & (prediction >= 151))

###Obtain bottleneck features from another pre-trained CNN
bottleneck_features = np.load("bottleneck_features/DogResnet50Data.npz")
train_DogResnet50 = bottleneck_features["train"]
valid_DogResnet50 = bottleneck_features["valid"]
test_DogResnet50 = bottleneck_features["test"]

###Define your architecture
Resnet50_model = tf.keras.Sequential()
Resnet50_model.add(tf.keras.layers.GlobalAveragePooling2D(input_shape=train_DogResnet50.shape[1:]))
Resnet50_model.add(tf.contrib.keras.layers.Dense(133, activation="softmax"))

Resnet50_model.summary()

###Compile the model
Resnet50_model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
###Train the model
checkpointer = tf.keras.callbacks.ModelCheckpoint(filepath="saved_models/weights.best.ResNet50.hdf5",
                                                 verbose=1, save_best_only=True)

Resnet50_model.fit(train_DogResnet50, train_targets,
                  validation_data=(valid_DogResnet50, valid_targets),
                  epochs=20, batch_size=20, callbacks=[checkpointer])

###Load the model weights with the best validation loss.
Resnet50_model.load_weights("saved_models/weights.best.ResNet50.hdf5")

###Calculate classification accuracy on the test dataset
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_DogResnet50]

#Report test accuracy
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print("Test accuracy: %.4f%%" % test_accuracy)

def extract_Resnet50(tensor):
    from keras.applications.resnet50 import ResNet50, preprocess_input
    return ResNet50(weights='imagenet', include_top=False).predict(preprocess_input(tensor))

def dog_breed(img_path):
    #extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    #obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
    #return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

def dog_breed_predictor(img_path):
    #determine the predicted dog breed
    breed = dog_breed(img_path)
    #display the image
    img = cv2.imread(img_path)
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(cv_rgb)
    plt.show()
    #display relevant predictor result
    if dog_detector(img_path):
        print("This is a dog and its breed is: " + str(breed))
    elif face_detector(img_path):
        print("This is a human but it looks like a: " + str(breed))
    else:
        print("I don't know what this is.")

dog_breed_predictor("dogImages/dogImages/train/016.Beagle/Beagle_01126.jpg")

我输入到函数中的图像来自用于训练模型的同一数据集-我想看看模型是否按预期工作,所以这个错误使它更加令人困惑.我可能做错了什么?

解决方案

感谢 nessuno '在我的协助下,我找出了问题所在.问题确实出在ResNet50pooling层上.

上面我的脚本中的以下代码:

return ResNet50(weights='imagenet',
                include_top=False).predict(preprocess_input(tensor))

返回(1, 7, 7, 2048)的形状(虽然,我不能完全理解 为什么 ).为了解决这个问题,我这样添加了参数pooling="avg":

return ResNet50(weights='imagenet',
                include_top=False,
                pooling="avg").predict(preprocess_input(tensor))

这反而会返回(1, 2048)的形状(同样,请承认,我不知道 为什么 .)

但是,模型仍希望使用4D形状.为了解决这个问题,我在dog_breed()函数中添加了以下代码:

print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.

,这将返回(1, 1, 1, 1, 2048)的形状.出于某种原因,当我仅再添加2个尺寸时,该模型仍然抱怨它是3D形状,但是当我添加第3个尺寸时就停止了(这很奇怪,我想进一步了解这是为什么.).

总的来说,我的dog_breed()函数来自:

def dog_breed(img_path):
    #extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    #obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
    #return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

对此:

def dog_breed(img_path):
    #extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    print(bottleneck_feature.shape) #returns (1, 2048)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
    #obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
    #return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

确保将参数pooling="avg"添加到我对ResNet50的调用中.

I'm fairly new to TensorFlow and Image Classification, so I may be missing key knowledge and is probably why I'm facing this issue.

I've built a ResNet50 model in TensorFlow for the purpose of image classification of Dog Breeds using the ImageNet library and I have successfully trained a neural network which can detect various Dog Breeds.

I'm now at the point in which I would like to pass a random image of a dog to my model for it to spit out an output on what it thinks the dog breed is. However, when I run this function, dog_breed_predictor("<file path to image>"), I get the error expected global_average_pooling2d_1_input to have shape (1, 1, 2048) but got array with shape (7, 7, 2048) when it tries to execute the line Resnet50_model.predict(bottleneck_feature) and I don't know how to get around this.

Here's the code. I've provided all that I feel is relevant to the problem.

import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm

from sklearn.datasets import load_files
np_utils = tf.keras.utils

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/dogImages/valid')
test_files, test_targets = load_dataset('dogImages/dogImages/test')

#define Resnet50 model
Resnet50_model = ResNet50(weights="imagenet")

def path_to_tensor(img_path):
    #loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    #convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    #convert 3D tensor into 4D tensor with shape (1, 224, 224, 3)
    return np.expand_dims(x, axis=0)

from keras.applications.resnet50 import preprocess_input, decode_predictions

def ResNet50_predict_labels(img_path):
    #returns prediction vector for image located at img_path
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(Resnet50_model.predict(img))

###returns True if a dog is detected in the image stored at img_path
def dog_detector(img_path):
    prediction = ResNet50_predict_labels(img_path)
    return ((prediction <= 268) & (prediction >= 151))

###Obtain bottleneck features from another pre-trained CNN
bottleneck_features = np.load("bottleneck_features/DogResnet50Data.npz")
train_DogResnet50 = bottleneck_features["train"]
valid_DogResnet50 = bottleneck_features["valid"]
test_DogResnet50 = bottleneck_features["test"]

###Define your architecture
Resnet50_model = tf.keras.Sequential()
Resnet50_model.add(tf.keras.layers.GlobalAveragePooling2D(input_shape=train_DogResnet50.shape[1:]))
Resnet50_model.add(tf.contrib.keras.layers.Dense(133, activation="softmax"))

Resnet50_model.summary()

###Compile the model
Resnet50_model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
###Train the model
checkpointer = tf.keras.callbacks.ModelCheckpoint(filepath="saved_models/weights.best.ResNet50.hdf5",
                                                 verbose=1, save_best_only=True)

Resnet50_model.fit(train_DogResnet50, train_targets,
                  validation_data=(valid_DogResnet50, valid_targets),
                  epochs=20, batch_size=20, callbacks=[checkpointer])

###Load the model weights with the best validation loss.
Resnet50_model.load_weights("saved_models/weights.best.ResNet50.hdf5")

###Calculate classification accuracy on the test dataset
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_DogResnet50]

#Report test accuracy
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print("Test accuracy: %.4f%%" % test_accuracy)

def extract_Resnet50(tensor):
    from keras.applications.resnet50 import ResNet50, preprocess_input
    return ResNet50(weights='imagenet', include_top=False).predict(preprocess_input(tensor))

def dog_breed(img_path):
    #extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    #obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
    #return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

def dog_breed_predictor(img_path):
    #determine the predicted dog breed
    breed = dog_breed(img_path)
    #display the image
    img = cv2.imread(img_path)
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(cv_rgb)
    plt.show()
    #display relevant predictor result
    if dog_detector(img_path):
        print("This is a dog and its breed is: " + str(breed))
    elif face_detector(img_path):
        print("This is a human but it looks like a: " + str(breed))
    else:
        print("I don't know what this is.")

dog_breed_predictor("dogImages/dogImages/train/016.Beagle/Beagle_01126.jpg")

The image I'm feeding into my function is from the same dataset that was used to train the model - I wanted to see myself if the model is working as intended - so this error makes it extra confusing. What could I be doing wrong?

解决方案

Thanks to nessuno's assistance, I figured out the issue. The problem was indeed with the pooling layer of ResNet50.

The following code in my script above:

return ResNet50(weights='imagenet',
                include_top=False).predict(preprocess_input(tensor))

returns a shape of (1, 7, 7, 2048) (admittedly though, I do not fully understand why). To get around this, I added in the parameter pooling="avg" as so:

return ResNet50(weights='imagenet',
                include_top=False,
                pooling="avg").predict(preprocess_input(tensor))

This instead returns a shape of (1, 2048) (again, admittedly, I do not know why.)

However, the model still expects a 4-D shape. To get around this I added in the following code in my dog_breed() function:

print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.

and this returns a shape of (1, 1, 1, 1, 2048). For some reason, the model still complained it was a 3D shape when I only added 2 more dimensions, but stopped when I added a 3rd (this is peculiar, and I would like to find out more about why this is.).

So overall, my dog_breed() function went from:

def dog_breed(img_path):
    #extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    #obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
    #return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

to this:

def dog_breed(img_path):
    #extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    print(bottleneck_feature.shape) #returns (1, 2048)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
    #obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
    #return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

whilst ensuring the parameter pooling="avg" is added to my call to ResNet50.

这篇关于TensorFlow/Keras-预期global_average_pooling2d_1_input的形状为(1,1,2048),但数组的形状为(7,7,2048)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆