Different results while training with CudnnLSTM compared to regular LSTMCell in Tensorflow

Problem description

I'm training an LSTM network with Tensorflow in Python and wanted to switch to tf.contrib.cudnn_rnn.CudnnLSTM for faster training. What I did was replace

cells = tf.nn.rnn_cell.LSTMCell(self.num_hidden)
initial_state = cells.zero_state(self.batch_size, tf.float32)
rnn_outputs, _ = tf.nn.dynamic_rnn(cells, my_inputs, initial_state=initial_state)

with

lstm = tf.contrib.cudnn_rnn.CudnnLSTM(1, self.num_hidden)
rnn_outputs, _ = lstm(my_inputs)

I'm experiencing a significant training speedup (more than 10x), but at the same time my performance metric goes down. AUC on a binary classification task is 0.741 when using LSTMCell and 0.705 when using CudnnLSTM. I'm wondering whether I'm doing something wrong or whether it's down to a difference in implementation between the two, and if that's the case, how to get my performance back while continuing to use CudnnLSTM.

The training dataset has 15,337 sequences of varying length (up to a few hundred elements) that are padded with zeros to the same length within each batch. All the code is the same, including the TF Dataset API pipeline and all evaluation metrics. I ran each version a few times, and in all cases it converges around those values.
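
For reference, a minimal sketch of that kind of zero padding via the tf.data API; the element structure and names here (embedding_size, a (sequence, label) layout) are assumptions for illustration, not taken from the question:

# Hypothetical sketch: each dataset element is a (sequence, label) pair,
# where sequence has shape (seq_len, embedding_size) and seq_len varies.
# padded_batch zero-pads every sequence to the longest one in its batch.
dataset = dataset.padded_batch(
    self.batch_size,
    padded_shapes=([None, embedding_size], []))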

Moreover, I have a few other datasets that can be plugged into exactly the same model, and the problem persists on all of them.

In the Tensorflow code for cudnn_rnn I found a sentence saying:

"Cudnn LSTM and GRU are mathematically different from their tf counterparts."

But there's no explanation of what those differences really are...

Recommended answer

It seems tf.contrib.cudnn_rnn.CudnnLSTM is time-major, so it should be fed sequences of shape (seq_len, batch_size, embedding_size) instead of (batch_size, seq_len, embedding_size), meaning you would have to transpose your input (I think; one can't be sure when it comes to the messy Tensorflow documentation, but you may want to test that yourself; see the links below if you wish to check it).
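
As a minimal sketch of that transpose (assuming my_inputs is batch-major, i.e. (batch_size, seq_len, embedding_size), as in the question's code):

# CudnnLSTM expects time-major input: (seq_len, batch_size, embedding_size).
time_major_inputs = tf.transpose(my_inputs, perm=[1, 0, 2])

lstm = tf.contrib.cudnn_rnn.CudnnLSTM(num_layers=1, num_units=self.num_hidden)
rnn_outputs, _ = lstm(time_major_inputs)

# The outputs come back time-major as well; transpose back if the rest of
# the model expects batch-major tensors.
rnn_outputs = tf.transpose(rnn_outputs, perm=[1, 0, 2])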

More information on the topic here (that page contains another link pointing towards the math differences), except one thing there seems to be wrong: not only is the GRU time-major, the LSTM is as well (as pointed out by this issue).

I would advise against using tf.contrib, as it's even messier (and will finally be left out of Tensorflow 2.0 releases), and would stick to keras if possible (as it will be the main front-end of the upcoming Tensorflow 2.0) or tf.nn, as it's going to be part of the tf.Estimator API (though it's far less readable, IMO).
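
For the keras route, a hedged sketch (assuming a TF 1.x version where tf.keras.layers.CuDNNLSTM is available; self.num_hidden and my_inputs carry over from the question):

# Assumption: the Keras CuDNN layer takes batch-major input by default,
# (batch_size, seq_len, embedding_size), so no transpose is needed here.
lstm_layer = tf.keras.layers.CuDNNLSTM(self.num_hidden, return_sequences=True)
rnn_outputs = lstm_layer(my_inputs)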

... or consider using PyTorch to save yourself the hassle; there, at the very least, input shapes (and their meaning) are provided in the documentation.
