RNN in TensorFlow vs Keras, deprecation of tf.nn.dynamic_rnn()

Views: 215 | Posted: 2020/4/25 10:09:16 | Tags: tensorflow, keras, tf.keras

Question

My question is: are tf.nn.dynamic_rnn and keras.layers.RNN(cell) truly identical, as stated in the docs?

I am planning on building an RNN; however, it seems that tf.nn.dynamic_rnn is deprecated in favour of Keras.

In particular, the documentation states:

Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Please use keras.layers.RNN(cell), which is equivalent to this API

But I don't see how the APIs are equivalent in the case of variable sequence lengths!

In raw TF, we can pass a sequence_length tensor with one entry per sequence in the batch. This way, if our sequence is [0, 1, 2, 3, 4] and the longest sequence in the batch has length 10, we can pad it with 0s to get [0, 1, 2, 3, 4, 0, 0, 0, 0, 0] and set seq_length=5 so that only [0, 1, 2, 3, 4] is processed.
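
As a rough illustration of the padding scheme described above, the sketch below pads a batch of variable-length sequences with zeros and records the true lengths in a separate vector, one entry per sequence. The helper name pad_batch is hypothetical and not part of TensorFlow; the lengths vector plays the role of the sequence_length argument.

```python
import numpy as np


def pad_batch(sequences, pad_value=0):
    """Right-pad variable-length integer sequences to a common length.

    Hypothetical helper for illustration; not part of TensorFlow or Keras.
    Returns the padded (batch_size, max_len) array and the true lengths.
    """
    lengths = np.array([len(s) for s in sequences], dtype=np.int64)
    max_len = int(lengths.max())
    padded = np.full((len(sequences), max_len), pad_value, dtype=np.int64)
    for row, seq in enumerate(sequences):
        padded[row, :len(seq)] = seq
    return padded, lengths


batch = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 1, 2, 3, 4, 5]]
padded, lengths = pad_batch(batch)
print(padded[0])   # first sequence, padded to length 10
print(lengths)     # true lengths, one per sequence
```

Note that the leading 0 in the first sequence is real data; only the lengths vector distinguishes it from the padding zeros.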

However, this is not how it works in Keras! What we can do is specify mask_zero=True in an earlier layer, e.g. the Embedding layer. But this will also mask the first zero, which is real data!

I can work around it by adding one to the whole vector, but that is extra preprocessing that I need to do after processing with tft.compute_vocabulary(), which maps vocabulary words to a 0-indexed vector.
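
The workaround mentioned above can be sketched as follows: shift every real token id up by one so that 0 is reserved for padding, then pad. The mask computed here (ids != 0) is, to my understanding, what an Embedding layer with mask_zero=True derives from its input. The helper name shift_and_pad is made up for this example.

```python
import numpy as np


def shift_and_pad(sequences):
    """Reserve id 0 for padding by shifting all real token ids up by one.

    Hypothetical helper; with an Embedding layer this also means the
    vocabulary size (input_dim) must grow by one.
    """
    max_len = max(len(s) for s in sequences)
    padded = np.zeros((len(sequences), max_len), dtype=np.int64)
    for row, seq in enumerate(sequences):
        padded[row, :len(seq)] = np.asarray(seq, dtype=np.int64) + 1
    return padded


raw = [[0, 1, 2, 3, 4], [2, 0, 1]]
ids = shift_and_pad(raw)
mask = ids != 0          # True for real tokens, False for padding
print(ids)
print(mask.sum(axis=1))  # recovered sequence lengths
```

After the shift, a real token with original id 0 becomes 1 and is no longer confused with padding.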

Recommended answer

No, but they are (or can be made to be) not so different either.

tf.nn.dynamic_rnn replaces the elements after the end of each sequence with 0s. As far as I know, this cannot be replicated with tf.keras.layers.*, but you can get similar behaviour with the RNN(Masking(...)) approach: it simply stops the computation and carries the last outputs and states forward. The outputs at the non-padding positions are the same as those obtained from tf.nn.dynamic_rnn.
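
To make the difference concrete, here is a small NumPy sketch of the two padding behaviours described above, using a toy recurrence h_t = 0.5 * h_{t-1} + x_t rather than a real GRU. The function names and the recurrence are invented for illustration; this mimics the described semantics and is not how either API is implemented.

```python
import numpy as np


def run_zero_after_end(x, lengths):
    """dynamic_rnn-style: outputs past each sequence's length are zero."""
    batch, steps = x.shape
    h = np.zeros(batch)
    out = np.zeros((batch, steps))
    for t in range(steps):
        active = t < lengths                        # rows still inside their sequence
        h = np.where(active, 0.5 * h + x[:, t], h)  # state is held once a row ends
        out[:, t] = np.where(active, h, 0.0)
    return out


def run_carry_forward(x, mask):
    """Keras-masking-style: masked steps are skipped and repeat the last output."""
    batch, steps = x.shape
    h = np.zeros(batch)
    out = np.zeros((batch, steps))
    for t in range(steps):
        h = np.where(mask[:, t], 0.5 * h + x[:, t], h)
        out[:, t] = h
    return out


x = np.array([[1.0, 2.0, 1.0, 0.0, 0.0]])
lengths = np.array([3])
mask = np.arange(x.shape[1]) < lengths[:, None]

print(run_zero_after_end(x, lengths))
print(run_carry_forward(x, mask))
```

Both functions agree on the first three (non-padding) steps and differ only in what they emit for the padded positions.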

Here is a minimal working example demonstrating the differences between tf.nn.dynamic_rnn and tf.keras.layers.GRU, with and without a tf.keras.layers.Masking layer.

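# Written against the TensorFlow 1.x graph-mode API
# (tf.nn.dynamic_rnn and tf.keras.backend.get_session).
import numpy as np
import tensorflow as tf

# Two padded sequences of token ids; the true lengths are 3 and 4.
# Note that the second sequence starts with a real token whose id is 0.
test_input = np.array([
    [1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0]
], dtype=int)
seq_length = tf.constant(np.array([3, 4], dtype=int))

# Fixed embedding table: id 0 -> [0.37, 0.37], id 1 -> [1, 1], id 2 -> [2, 2].
emb_weights = (np.ones(shape=(3, 2)) * np.transpose([[0.37, 1, 2]])).astype(np.float32)
emb = tf.keras.layers.Embedding(
    *emb_weights.shape,
    weights=[emb_weights],
    trainable=False
)
# Mask every timestep whose embedding equals the vector for id 0.
mask = tf.keras.layers.Masking(mask_value=0.37)
# GRU with fixed initializers and no activations, so the outputs are deterministic.
rnn = tf.keras.layers.GRU(
    1,
    return_sequences=True,
    activation=None,
    recurrent_activation=None,
    kernel_initializer='ones',
    recurrent_initializer='zeros',
    use_bias=True,
    bias_initializer='ones'
)


def old_rnn(inputs):
    # Run the same GRU cell through the deprecated API with explicit lengths.
    rnn_outputs, rnn_states = tf.nn.dynamic_rnn(
        rnn.cell,
        inputs,
        dtype=tf.float32,
        sequence_length=seq_length
    )
    return rnn_outputs


x = tf.keras.layers.Input(shape=test_input.shape[1:])
m0 = tf.keras.Model(inputs=x, outputs=emb(x))             # embedding only
m1 = tf.keras.Model(inputs=x, outputs=rnn(emb(x)))        # GRU without masking
m2 = tf.keras.Model(inputs=x, outputs=rnn(mask(emb(x))))  # GRU with masking

print(m0.predict(test_input).squeeze())
print(m1.predict(test_input).squeeze())
print(m2.predict(test_input).squeeze())

# Evaluate the old API in the session that Keras is using.
sess = tf.keras.backend.get_session()
print(sess.run(old_rnn(mask(emb(x))), feed_dict={x: test_input}).squeeze())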

The outputs from m0 show the result of applying the embedding layer. Note that there are no zero entries at all:

[[[1.   1.  ]    [[0.37 0.37]
  [2.   2.  ]     [1.   1.  ]
  [1.   1.  ]     [2.   2.  ]
  [0.37 0.37]     [1.   1.  ]
  [0.37 0.37]]    [0.37 0.37]]]

Now here are the actual outputs from the m1, m2 and old_rnn architectures:

m1: [[  -6.  -50. -156. -272.7276 -475.83362]
     [  -1.2876 -9.862801 -69.314 -213.94202 -373.54672 ]]
m2: [[  -6.  -50. -156. -156. -156.]
     [   0.   -6.  -50. -156. -156.]]
old [[  -6.  -50. -156.    0.    0.]
     [   0.   -6.  -50. -156.    0.]]

Summary

The old tf.nn.dynamic_rnn masked padding elements with zeros.
The new RNN layers without masking run over the padding elements as if they were data.
The new rnn(mask(...)) approach simply stops the computation and carries the last outputs and states forward. Note that the (non-padding) outputs that I obtained with this approach are exactly the same as those from tf.nn.dynamic_rnn.
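
If the zero-filled outputs of tf.nn.dynamic_rnn are needed downstream, one possible approach is to multiply the masked Keras outputs by a sequence mask built from the lengths. The sketch below applies this idea in NumPy to the m2 values printed above, with lengths [3, 4]. It is an assumption about how one might post-process the outputs, not something the Keras layers do on their own.

```python
import numpy as np

# m2 outputs and sequence lengths copied from the example above.
m2_out = np.array([
    [-6.0, -50.0, -156.0, -156.0, -156.0],
    [0.0, -6.0, -50.0, -156.0, -156.0],
])
seq_length = np.array([3, 4])

# Boolean mask of shape (batch, steps): True while t < length.
steps = np.arange(m2_out.shape[1])
seq_mask = steps[None, :] < seq_length[:, None]

# Zero out everything past the end of each sequence.
zeroed = np.where(seq_mask, m2_out, 0.0)
print(zeroed)
```

The result matches the old_rnn values listed above. In TensorFlow itself, tf.sequence_mask builds the same kind of mask from a lengths vector.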

Anyway, I cannot cover all possible edge cases, but I hope that you can use this script to figure things out further.
