Numpy and tensorflow RNN shape representation mismatch

Problem description

I'm building my first RNN in tensorflow. After understanding all the concepts regarding the 3D input shape, I came across this issue.

In my numpy version (1.15.4), the shape representation of 3D arrays is the following: (panel, row, column). I will make each dimension different so that it is clearer:

In [1]: import numpy as np                                                                                                                  

In [2]: arr = np.arange(30).reshape((2,3,5))                                                                                                

In [3]: arr                                                                                                                                 
Out[3]: 
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])

In [4]: arr.shape                                                                                                                           
Out[4]: (2, 3, 5)

In [5]: np.__version__                                                                                                                      
Out[5]: '1.15.4'

My understanding is: I have two timesteps, each containing 3 observations, and each observation has 5 features.

However, in tensorflow "theory" (which I believe is strongly based on numpy), RNN cells expect tensors (i.e. just n-dimensional matrices) of shape [batch_size, timesteps, features], which could be translated to (row, panel, column) in the numpy "jargon".

As can be seen, the representation doesn't match, leading to errors when feeding numpy data into a placeholder, which in most of the examples and theory is defined like:

x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')

  • np.reshape() doesn't solve the issue because it just rearranges the dimensions but messes up the data (see the sketch below).
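
To make that concrete, here is a small illustration of my own (not part of the original question) using the same (2, 3, 5) array as above: reshape() re-reads the flat buffer under a new shape and mixes observations from different timesteps, while transpose() only permutes the axes and keeps each observation's sequence intact.

import numpy as np

arr = np.arange(30).reshape((2, 3, 5))   # interpreted as (timesteps, observations, features)

# reshape() just re-reads the same flat buffer with a new shape:
reshaped = arr.reshape((3, 2, 5))
print(reshaped[0])
# [[0 1 2 3 4]
#  [5 6 7 8 9]]   <- both rows belong to timestep 0 (observations 0 and 1)

# transpose() permutes the axes instead, giving (observations, timesteps, features):
transposed = arr.transpose((1, 0, 2))
print(transposed.shape)                  # (3, 2, 5)
print(transposed[0])
# [[ 0  1  2  3  4]
#  [15 16 17 18 19]]  <- observation 0 at timestep 0 and at timestep 1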

I'm using the Dataset API for the first time, but I encounter the problems once inside the session, not in the Dataset API ops.

I'm using the static_rnn method, and everything works well until I have to feed the data into the placeholder, which obviously results in a shape error.

I have tried changing the placeholder shape to shape=[N_TIMESTEPS_X, None, N_FEATURES]. However, I'm using the Dataset API, and I get errors when making the initializer if I change the Xplaceholder to shape=[N_TIMESTEPS_X, None, N_FEATURES].
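
My reading of why that initializer fails (an aside of mine, not stated in the original question): tf.data.Dataset.from_tensor_slices((x, y)) slices every component along axis 0, so x and y must have the same size on that leading axis. A minimal sketch of the constraint, using hypothetical toy arrays:

import numpy as np
import tensorflow as tf

# Hypothetical toy arrays, only to illustrate the from_tensor_slices constraint.
x_batch_major = np.zeros((100, 11, 74), dtype=np.float32)  # (samples, timesteps, features)
x_time_major  = np.zeros((11, 100, 74), dtype=np.float32)  # (timesteps, samples, features)
y_toy         = np.zeros((100, 3), dtype=np.float32)       # (samples, outputs)

# Works: both components have 100 entries along axis 0.
ok = tf.data.Dataset.from_tensor_slices((x_batch_major, y_toy))

# Fails: axis 0 of x holds the 11 timesteps while axis 0 of y holds the 100 samples,
# which is what happens when the X placeholder is [N_TIMESTEPS_X, None, N_FEATURES].
# bad = tf.data.Dataset.from_tensor_slices((x_time_major, y_toy))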

So, to summarize:

  • First problem: shape errors caused by the different shape representations.
  • Second problem: Dataset error when I make the shape representations match (I think either static_rnn or dynamic_rnn would work once this is resolved).

My questions are:

Is there anything I'm missing in regard to this different representation logic which makes the practice confusing?

Could the solution be switching to dynamic_rnn? (Although the shape problems I encounter are related to the Dataset API initializer being fed with shape [N_TIMESTEPS_X, None, N_FEATURES], not to the RNN cell itself.)

Thank you very much for your time.

Full code:

'''The idea is to create xt, yt, xval and yval. My numpy arrays to 
be fed are of the following shapes: 

The 3D xt array has a shape of: (11, 69579, 74)
The 3D xval array has a shape of: (11, 7732, 74)

The yt array has a shape of: (69579, 3)
The yval array has a shape of: (7732, 3)

'''

## Imports used below (xt, yt, xval, yval are assumed to already exist as numpy arrays).
import os
from datetime import datetime

import tensorflow as tf

N_TIMESTEPS_X = xt.shape[0] ## The stack number
BATCH_SIZE = 256
#N_OBSERVATIONS = xt.shape[1]
N_FEATURES = xt.shape[2]
N_OUTPUTS = yt.shape[1]
N_NEURONS_LSTM = 128 ## Number of units in the LSTMCell 
N_NEURONS_DENSE = 64 ## Number of units in the Dense layer
N_EPOCHS = 600
LEARNING_RATE = 0.1

### Define the placeholders and gather the data.
train_data = (xt, yt)
validation_data = (xval, yval)

## We define the placeholders as a trick so that we do not break into memory problems, associated with feeding the data directly.
'''As an alternative, you can define the Dataset in terms of tf.placeholder() tensors, and feed the NumPy arrays when you initialize an Iterator over the dataset.'''
batch_size = tf.placeholder(tf.int64)
x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')
y = tf.placeholder(tf.float32, shape=[None, N_OUTPUTS], name='YPlaceholder')

# Creating the two different dataset objects.
train_dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(BATCH_SIZE).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(BATCH_SIZE)

# Creating the Iterator type that permits to switch between datasets.
itr = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_init_op = itr.make_initializer(train_dataset)
validation_init_op = itr.make_initializer(val_dataset)

next_features, next_labels = itr.get_next()

### Create the graph 
cellType = tf.nn.rnn_cell.LSTMCell(num_units=N_NEURONS_LSTM, name='LSTMCell')
inputs = tf.unstack(next_features, N_TIMESTEPS_X, axis=0)
'''inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size]'''
RNNOutputs, _ = tf.nn.static_rnn(cell=cellType, inputs=inputs, dtype=tf.float32)
predictionsLayer = tf.layers.dense(inputs=tf.layers.batch_normalization(RNNOutputs[-1]), units=N_NEURONS_DENSE, activation=None, name='Dense_Layer')

### Define the cost function, that will be optimized by the optimizer. 
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=predictionsLayer, labels=next_labels, name='Softmax_plus_Cross_Entropy'))
optimizer_type = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='AdamOptimizer')
optimizer = optimizer_type.minimize(cost)

### Model evaluation 
correctPrediction = tf.equal(tf.argmax(predictionsLayer,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correctPrediction,tf.float32))
#confusionMatrix = tf.confusion_matrix(next_labels, predictionsLayer, num_classes=3, name='ConfMatrix')
N_BATCHES = train_data[0].shape[0] // BATCH_SIZE

## Saving variables so that we can restore them afterwards.
saver = tf.train.Saver()
save_dir = '/home/zmlaptop/Desktop/tfModels/{}_{}'.format(cellType.__class__.__name__, datetime.now().strftime("%Y%m%d%H%M%S"))
os.mkdir(save_dir)
varDict = {'nTimeSteps':N_TIMESTEPS_X, 'BatchSize': BATCH_SIZE, 'nFeatures':N_FEATURES,
           'nNeuronsLSTM':N_NEURONS_LSTM, 'nNeuronsDense':N_NEURONS_DENSE, 'nEpochs':N_EPOCHS,
           'learningRate':LEARNING_RATE, 'optimizerType': optimizer_type.__class__.__name__}
varDicSavingTxt = save_dir + '/varDict.txt'
modelFilesDir = save_dir + '/modelFiles'
os.mkdir(modelFilesDir)

logDir = save_dir + '/TBoardLogs'
os.mkdir(logDir)

acc_summary = tf.summary.scalar('Accuracy', accuracy)
loss_summary = tf.summary.scalar('Cost_CrossEntropy', cost)
summary_merged = tf.summary.merge_all()

with open(varDicSavingTxt, 'w') as outfile:
    outfile.write(repr(varDict))

with tf.Session() as sess:

    tf.set_random_seed(2)
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(logDir + '/train', sess.graph)
    validation_writer = tf.summary.FileWriter(logDir + '/validation')

    # initialise iterator with train data
    sess.run(train_init_op, feed_dict = {x : train_data[0], y: train_data[1], batch_size: BATCH_SIZE})

    print('¡Training starts!')
    for epoch in range(N_EPOCHS):

        batchAccList = []
        tot_loss = 0

        for batch in range(N_BATCHES):

            optimizer_output, loss_value, summary = sess.run([optimizer, cost, summary_merged])
            accBatch = sess.run(accuracy)
            tot_loss += loss_value
            batchAccList.append(accBatch)

            if batch % 10 == 0:

                train_writer.add_summary(summary, batch)

        epochAcc = tf.reduce_mean(batchAccList)

        if epoch%10 == 0:

            print("Epoch: {}, Loss: {:.4f}, Accuracy: {}".format(epoch, tot_loss / N_BATCHES, epochAcc))

    #confM = sess.run(confusionMatrix)
    #confDic = {'confMatrix': confM}
    #confTxt = save_dir + '/confMDict.txt'
    #with open(confTxt, 'w') as outfile:
    #    outfile.write(repr(confDic))
    #print(confM)

    # initialise iterator with validation data
    sess.run(validation_init_op, feed_dict = {x : validation_data[0], y: validation_data[1], batch_size:len(validation_data[0])})
    print('Validation Loss: {:4f}, Validation Accuracy: {}'.format(sess.run(cost), sess.run(accuracy)))
    summary_val = sess.run(summary_merged)
    validation_writer.add_summary(summary_val)

    saver.save(sess, modelFilesDir)

Answer

Is there anything I'm missing in regard to this different representation logic which makes the practice confusing?

In fact, you made a mistake about the input shapes of static_rnn and dynamic_rnn. The input shape of static_rnn is [timesteps, batch_size, features] (link), i.e. a list of 2D tensors of shape [batch_size, features]. But the input shape of dynamic_rnn is either [timesteps, batch_size, features] or [batch_size, timesteps, features], depending on whether time_major is True or False (link).
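
For reference, here is a minimal sketch of my own (not part of the original answer) of how dynamic_rnn accepts either layout, reusing the names from the question's code:

# Minimal sketch; assumes next_features is [batch_size, N_TIMESTEPS_X, N_FEATURES].
cell = tf.nn.rnn_cell.LSTMCell(num_units=N_NEURONS_LSTM)

# Batch-major input (time_major=False is the default).
outputs, state = tf.nn.dynamic_rnn(cell, next_features, dtype=tf.float32)
last_output = outputs[:, -1, :]   # analogue of RNNOutputs[-1] in the static_rnn version

# Time-major alternative: inputs shaped [N_TIMESTEPS_X, batch_size, N_FEATURES].
# outputs_tm, state_tm = tf.nn.dynamic_rnn(
#     cell, tf.transpose(next_features, perm=[1, 0, 2]),
#     time_major=True, dtype=tf.float32)

# static_rnn, by contrast, wants a Python list of N_TIMESTEPS_X tensors of shape
# [batch_size, N_FEATURES], e.g. tf.unstack(next_features, axis=1).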

Could the solution be switching to dynamic_rnn?

The key is not whether you use static_rnn or dynamic_rnn, but that your data shape matches the required shape. The general placeholder format, as in your code, is [None, N_TIMESTEPS_X, N_FEATURES], which is also convenient for the Dataset API. You can use transpose() (link) instead of reshape(): transpose() permutes the dimensions of the array and won't mess up the data.

So your code needs to be modified.

# permute the dimensions
xt = xt.transpose([1,0,2])
xval = xval.transpose([1,0,2])

# unstack along axis=1, which represents the timesteps
inputs = tf.unstack(next_features,  axis=1)
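
As a quick sanity check of my own (shapes taken from the question): after the transpose, the arrays line up with the [None, N_TIMESTEPS_X, N_FEATURES] placeholder, and the unstack on axis=1 yields the list of [batch_size, N_FEATURES] tensors that static_rnn expects.

# Shapes from the question, after the transpose above.
assert xt.shape == (69579, 11, 74)    # (samples, timesteps, features)
assert xval.shape == (7732, 11, 74)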

The other errors should have nothing to do with the RNN shape.
