How to improve the word RNN accuracy in TensorFlow?


Problem Description

I'm working on a title auto-generation project with tensorflow seq2seq.rnn_decoder.

My training set is a big set of titles; each title is independent of the others and unrelated to them.

I have tried two data formats for training:

F1. Fixed sequence length per batch: replace '\n' with '<eos>' (index 1) and let titles run together, so a training batch looks like: [2,3,4,5,8,9,1,2,3,4], [88,99,11,90,1,5,6,7,8,10]
F2. Variable sequence length per batch: right-pad each title with PAD (index 0) up to the fixed length, so a training batch looks like: [2,3,4,5,8,9,0,0,0,0], [2,3,4,88,99,90,11,0,0,0]
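
For concreteness, here is a minimal sketch of the two layouts in Python (the token ids are the ones from the examples above; the pad_to helper and the chunk length are made up for illustration):

```python
EOS, PAD = 1, 0  # ids used in the question

# F1: titles concatenated into one stream with '\n' replaced by <eos>,
# then cut into fixed-length chunks -- a chunk can start mid-title.
f1_batch = [[2, 3, 4, 5, 8, 9, 1, 2, 3, 4],
            [88, 99, 11, 90, 1, 5, 6, 7, 8, 10]]

def pad_to(seq, length, pad=PAD):
    """F2: one title per row, right-padded with PAD to the fixed length."""
    return seq + [pad] * (length - len(seq))

f2_batch = [pad_to([2, 3, 4, 5, 8, 9], 10),
            pad_to([2, 3, 4, 88, 99, 90, 11], 10)]
```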

Then I tested on a small set of 10,000 titles, but the results confused me.

F1 makes good single-word predictions, like this:

iphone → 6
samsung → galaxy
case → cover

F2 makes good predictions on long sentences if the input starts from the first word of the sentence; often the prediction is almost identical to the original sentence.

But if the starting word comes from the middle (or near the end) of the sentence, F2's prediction is very bad, almost like a random result.

Is this situation related to the hidden state?

In the training phase, I reset the hidden state to 0 only when a new epoch begins, so all batches within an epoch carry over the same running hidden state. I suspect this is not good practice, because the sentences are actually independent; should they really share the same hidden state during training?
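
If they should not, a minimal sketch of the per-batch alternative looks like this. It is TF 1.x style to match seq2seq.rnn_decoder, but the BasicLSTMCell, the dynamic_rnn wiring, and all sizes are my assumptions, not your actual graph:

```python
import tensorflow as tf  # TF 1.x, same era as seq2seq.rnn_decoder

batch_size, seq_len, vocab_size, hidden = 32, 10, 10000, 256  # assumed sizes

inputs = tf.placeholder(tf.int32, [batch_size, seq_len])
embedding = tf.get_variable("embedding", [vocab_size, hidden])
embedded = tf.nn.embedding_lookup(embedding, inputs)

cell = tf.nn.rnn_cell.BasicLSTMCell(hidden)
# A fresh zero state for every batch (not just every epoch): each
# independent title then starts from the same initial state instead of
# inheriting whatever state the previous batch ended in.
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, final_state = tf.nn.dynamic_rnn(cell, embedded,
                                         initial_state=initial_state)
```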

In the inference phase, the initial hidden state is 0, and it is updated each time a word is fed in (and reset to 0 when the input is cleared).
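
That update scheme might look like the following self-contained sketch (again TF 1.x, with all names and sizes assumed): a batch of 1, one word per step, the returned final_state carried back in as the next initial state, and a reset to zeros only when a new title starts:

```python
import tensorflow as tf  # TF 1.x

vocab_size, hidden = 10000, 256                      # assumed sizes
inputs = tf.placeholder(tf.int32, [1, 1])            # one word at a time
embedding = tf.get_variable("embedding", [vocab_size, hidden])
cell = tf.nn.rnn_cell.BasicLSTMCell(hidden)
init_state = cell.zero_state(1, tf.float32)
outputs, final_state = tf.nn.dynamic_rnn(
    cell, tf.nn.embedding_lookup(embedding, inputs),
    initial_state=init_state)
probs = tf.nn.softmax(tf.layers.dense(outputs[:, -1, :], vocab_size))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    state = sess.run(init_state)                     # reset: input cleared
    for word_id in [2, 3, 4]:                        # made-up fed ids
        p, state = sess.run([probs, final_state],
                            {inputs: [[word_id]], init_state: state})
```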

So my question is: why are F2's predictions bad when the starting word comes from the middle (or near the end) of the sentence? And what is the right way to update the hidden state in my project?

Answer

I'm not sure I understand your setting 100% correctly, but I think what you see happening is expected, and it has to do with the handling of the hidden state.

Let's first look at what you see in F2. Since you reset your hidden state every time, the network only sees a 0-state at the beginning of a whole title, right? So during training it probably never encounters a 0-state except when starting a sequence. When you try to decode from the middle, you start from a 0-state at a position the network has never seen during training, so it fails.

In F1 you also reset the state, but since you're not padding, the 0-state appears at more random positions during training -- sometimes at the beginning of a title, sometimes in the middle. And the network learns to cope with this.
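
If that diagnosis is right, one workaround you could try (my suggestion, not something in your current code) is to warm up the state before decoding from the middle: replay the words that precede the starting word, then decode from the resulting state instead of from zeros. A sketch, assuming the inputs, init_state, and final_state ops and an open session sess from an inference graph like the one in your question:

```python
def warmed_up_state(prefix_ids):
    """Replay the title's prefix so the state matches what training saw."""
    state = sess.run(init_state)
    for word_id in prefix_ids:
        state = sess.run(final_state,
                         {inputs: [[word_id]], init_state: state})
    return state

# To predict from the middle of [2, 3, 4, 88, 99, ...] starting at 88,
# feed the prefix first instead of starting from the 0 state:
state = warmed_up_state([2, 3, 4])
```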
