Is it possible to use vector methods to shift images stored in a numpy ndarray for data augmentation?


Question

Background: This is one of the exercise problems in the textbook Hands-On Machine Learning by Aurélien Géron.

The question is: Write a function that can shift an MNIST image in any direction (left, right, up, down) by one pixel. Then, for each image in the training set, create four shifted copies (one per direction) and add them to the training set.

My thought process:

  1. I have a numpy array of shape (59500, 784) in X_train (each row is a (28, 28) image). For each row of X_train:
     1. Reshape the row to (28, 28).
     2. For each direction (up, down, left, right):
        1. Shift the image and reshape it back to a flat row of 784 values.
        2. Write the row to an initially empty array.
  2. Append the new array to X_train.

My code:

    import numpy as np
    from scipy.ndimage import shift  # the scipy.ndimage.interpolation namespace is deprecated
    
    def shift_and_append(X, n):
        x_arr = np.zeros((1, 784))
        for i in range(n):
            for j in range(-1,2):
                for k in range(-1,2):
                    if j!=k and j!=-k:
                        x_arr = np.append(x_arr, shift(X[i,:].reshape(28,28), [j, k]).reshape(1, 784), axis=0)
        return np.append(X, x_arr[1:,:], axis=0)
    
    X_train_new = shift_and_append(X_train, X_train.shape[0])
    y_train_new = np.append(y_train, np.repeat(y_train, 4), axis=0)
    

It takes a long time to run, and it feels like brute force. Is there an efficient, vectorized way to achieve this?

Recommended answer

Three nested for loops with an if condition, reshaping and appending on every iteration, is clearly not a good idea; numpy.roll does the job beautifully in a vectorized way:

    import numpy as np
    import matplotlib.pyplot as plt 
    from keras.datasets import mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train.shape
    # (60000, 28, 28)
    
    # plot an original image
    plt.gray() 
    plt.matshow(x_train[0]) 
    plt.show() 
    

Let's first demonstrate the operations:

    # one pixel down:
    x_down = np.roll(x_train[0], 1, axis=0)
    plt.gray() 
    plt.matshow(x_down) 
    plt.show() 
    

    # one pixel up:
    x_up = np.roll(x_train[0], -1, axis=0)
    plt.gray() 
    plt.matshow(x_up) 
    plt.show() 
    

    # one pixel left:
    x_left = np.roll(x_train[0], -1, axis=1)
    plt.gray() 
    plt.matshow(x_left) 
    plt.show() 
    

    # one pixel right:
    x_right = np.roll(x_train[0], 1, axis=1)
    plt.gray() 
    plt.matshow(x_right) 
    plt.show() 
    

Having established that, we can generate, say, the "right" versions of all the training images simply with

    x_all_right = [np.roll(x, 1, axis=1) for x in x_train]
    

and similarly for the other three directions.
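The four single-image commands can also be wrapped in one helper, which is roughly what the exercise asks for. This is only a sketch: the name `shift_image` and the lookup table `_ROLL_ARGS` are made up for illustration, and a small 3x3 array stands in for an MNIST digit so the effect is easy to read.

```python
import numpy as np

# Hypothetical lookup table: direction name -> (shift, axis) for a 2-D image.
# axis 0 runs over rows (up/down), axis 1 over columns (left/right).
_ROLL_ARGS = {
    "down": (1, 0),
    "up": (-1, 0),
    "right": (1, 1),
    "left": (-1, 1),
}

def shift_image(image, direction):
    """Return a copy of a 2-D image rolled one pixel in the given direction."""
    step, axis = _ROLL_ARGS[direction]
    return np.roll(image, step, axis=axis)

# Tiny stand-in image instead of a real 28x28 digit.
demo = np.arange(9).reshape(3, 3)
shifted = shift_image(demo, "right")
```

With the real data, `shift_image(x_train[0], "right")` should give the same array as `x_right` above.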

Let's confirm that the first image in x_all_right is indeed what we want:

    plt.gray() 
    plt.matshow(x_all_right[0]) 
    plt.show()
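One caveat worth checking: np.roll is circular, so the column pushed off the right edge reappears on the left, whereas scipy.ndimage.shift (used in the question) fills the vacated pixels with zeros by default. The two only agree when the wrapped border is already blank. If that matters for your data, a zero-filled variant might look like the sketch below; `roll_right_zero_fill` is a hypothetical name, and the small array is a stand-in for an image.

```python
import numpy as np

def roll_right_zero_fill(image):
    """Shift a 2-D image one pixel right, filling the vacated column with 0."""
    shifted = np.roll(image, 1, axis=1)  # np.roll returns a new array
    shifted[:, 0] = 0  # overwrite the column that wrapped around from the right edge
    return shifted

demo = np.array([[1, 2, 3],
                 [4, 5, 6]])
wrapped = np.roll(demo, 1, axis=1)        # last column reappears on the left
zero_filled = roll_right_zero_fill(demo)  # left column is blank instead
```

The same pattern applies to the other directions by zeroing the row or column that wrapped.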
    

You can even avoid the last list comprehension in favor of pure NumPy code:

    x_all_right = np.roll(x_train, 1, axis=2)
    

which is more efficient, although slightly less intuitive (just take the respective single-image command and increase axis by 1).
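Putting the pieces together, the whole augmentation step from the exercise could be sketched as follows. The function name `augment_with_shifts` is an assumption for illustration, and random arrays with MNIST-like shapes are used so the snippet runs without downloading the dataset.

```python
import numpy as np

def augment_with_shifts(images, labels):
    """Append four one-pixel-shifted copies of every image (down, up, right, left).

    images: array of shape (n, height, width); labels: array of shape (n,).
    """
    shifted = [
        np.roll(images, 1, axis=1),   # down
        np.roll(images, -1, axis=1),  # up
        np.roll(images, 1, axis=2),   # right
        np.roll(images, -1, axis=2),  # left
    ]
    new_images = np.concatenate([images] + shifted, axis=0)
    # Each shifted block keeps the original image order, so the labels are tiled.
    new_labels = np.tile(labels, 5)
    return new_images, new_labels

# Random stand-in data with MNIST-like shapes (not the real dataset).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(10, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=10)

aug_images, aug_labels = augment_with_shifts(fake_images, fake_labels)
# To match the (n, 784) layout from the question, flatten each image.
aug_flat = aug_images.reshape(len(aug_images), -1)
```

Note that the labels here are built with np.tile rather than the np.repeat used in the question, because the shifted copies are appended one direction at a time instead of one image at a time.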
