The best way to pass the LSTM state between batches


Problem description

I am trying to find the best way to pass the LSTM state between batches. I have searched everywhere but could not find a solution for the current implementation. Imagine I have something like:

import tensorflow as tf
from tensorflow.contrib import rnn  # TF 1.x

cells = [rnn.LSTMCell(size) for size in [256, 256]]
cells = rnn.MultiRNNCell(cells, state_is_tuple=True)
init_state = cells.zero_state(tf.shape(x_hot)[0], dtype=tf.float32)
net, new_state = tf.nn.dynamic_rnn(cells, x_hot, initial_state=init_state, dtype=tf.float32)

Now I would like to pass new_state to the next batch efficiently, that is, without storing it back to host memory and re-feeding it to TensorFlow through feed_dict. More precisely, all the solutions I found use sess.run to evaluate new_state and a feed_dict to pass it into init_state. Is there any way to avoid the bottleneck of feed_dict?

I think I should use tf.assign in some way, but the documentation is incomplete and I could not find a workaround.

I want to thank in advance everybody who will answer.

Cheers,

Francesco Saverio

All the other answers I found on Stack Overflow either work for older versions or use the feed_dict method to pass the new state. For instance:

1) TensorFlow: Remember LSTM state for next batch (stateful LSTM): this works by feeding a state placeholder through feed_dict, which I want to avoid

2) Tensorflow - LSTM state reuse within batch

3) Saving LSTM RNN state between runs in Tensorflow

Answer

LSTMStateTuple is nothing more than a tuple of the output and the hidden state. tf.assign creates an operation that, when run, assigns a value stored in a tensor to a variable (if you have specific questions, please ask so that the docs can be improved). You can use tf.assign in your solution by retrieving the hidden state tensor from the tuple via its c attribute (assuming you want the hidden state): new_state.c

以下是有关玩具问题的完整的独立示例: https://gist.github.com/iganichev/632b425fed0263d0274ec5b922aa3b2f

Here is a complete self-contained example on a toy problem: https://gist.github.com/iganichev/632b425fed0263d0274ec5b922aa3b2f
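To make the idea concrete, here is a minimal sketch of the variables-plus-tf.assign approach, written against the TF 1.x graph API through tf.compat.v1. The sizes, variable names, and the single-layer cell are illustrative assumptions, not taken from the linked gist: the state lives in non-trainable variables inside the graph, so each sess.run both consumes and updates it without any feed_dict for the state.

```python
# Sketch of keeping LSTM state in graph variables (assumes TF 1.x graph
# mode via tf.compat.v1; shapes and names here are illustrative only).
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

batch_size, num_steps, num_features, num_units = 4, 5, 3, 8
x = tf.placeholder(tf.float32, [batch_size, num_steps, num_features])

cell = tf.nn.rnn_cell.LSTMCell(num_units)

# Hold the state in non-trainable variables so it persists inside the
# graph between sess.run calls -- no feed_dict round-trip for the state.
state_c = tf.get_variable("state_c", [batch_size, num_units],
                          initializer=tf.zeros_initializer(), trainable=False)
state_h = tf.get_variable("state_h", [batch_size, num_units],
                          initializer=tf.zeros_initializer(), trainable=False)
init_state = tf.nn.rnn_cell.LSTMStateTuple(state_c, state_h)

outputs, new_state = tf.nn.dynamic_rnn(cell, x, initial_state=init_state)

# tf.assign writes new_state back into the variables; new_state.c and
# new_state.h are the two components of the LSTMStateTuple.
update_state = tf.group(tf.assign(state_c, new_state.c),
                        tf.assign(state_h, new_state.h))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(batch_size, num_steps, num_features).astype(np.float32)
    # Fetching update_state alongside outputs persists the state, so the
    # next run starts from where this one ended.
    out, _ = sess.run([outputs, update_state], {x: batch})
    c_after = sess.run(state_c)
```

Two caveats with this pattern: the variables pin the batch size to a static shape (unlike the zero_state call in the question, which uses the dynamic tf.shape(x_hot)[0]), and with a MultiRNNCell you would keep one pair of variables per layer and rebuild the tuple of LSTMStateTuples the same way.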
