What's the difference between "hidden" and "output" in PyTorch LSTM?
Problem description
I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says:
Outputs: output, (h_n, c_n)
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len
- c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len
It seems that the variables output and h_n both give the values of the hidden state. Does h_n just redundantly provide the last time step that's already included in output, or is there something more to it than that?
Answer
I made a diagram. The names follow the PyTorch docs, although I renamed num_layers to w.
output comprises all the hidden states in the last layer ("last" depth-wise, not time-wise). (h_n, c_n) comprises the hidden states after the last timestep, t = n, so you could potentially feed them into another LSTM.
The diagram does not include the batch dimension.
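The relationship described above can be checked numerically. The following is only a minimal sketch: the sizes (seq_len=5, batch=3, and so on) and the variable names are made up for illustration, and it assumes the default batch_first=False layout and no dropout. It checks that the last time step of output matches the last layer of h_n, that (h_n, c_n) can be fed back in to continue a sequence, and how the pieces line up in the bidirectional case.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 4, 6, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers)
x = torch.randn(seq_len, batch, input_size)  # (seq_len, batch, input_size)

with torch.no_grad():
    output, (h_n, c_n) = lstm(x)

# output: hidden state of the LAST LAYER at EVERY time step.
output_shape = tuple(output.shape)  # (seq_len, batch, hidden_size)
# h_n / c_n: state of EVERY LAYER after the LAST time step.
h_n_shape = tuple(h_n.shape)        # (num_layers, batch, hidden_size)

# The overlap: last time step of output == last layer of h_n.
same_last = torch.allclose(output[-1], h_n[-1])

# What h_n adds: h_n[0], the final hidden state of the lower layer,
# does not appear anywhere in output. c_n is not in output at all.

# Feeding (h_n, c_n) back in: running the sequence in two chunks and
# passing the state along should reproduce the single-pass output.
with torch.no_grad():
    out_a, state = lstm(x[:3])
    out_b, _ = lstm(x[3:], state)
chunked_matches = torch.allclose(torch.cat([out_a, out_b]), output, atol=1e-5)

# Bidirectional case: output concatenates both directions on the last axis,
# and h_n is ordered (layer0_fwd, layer0_bwd, layer1_fwd, layer1_bwd).
bi = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)
with torch.no_grad():
    out_bi, (h_bi, _) = bi(x)

# The forward direction finishes at the last time step ...
fwd_ok = torch.allclose(out_bi[-1, :, :hidden_size], h_bi[-2])
# ... while the backward direction finishes at the FIRST time step.
bwd_ok = torch.allclose(out_bi[0, :, hidden_size:], h_bi[-1])

print(output_shape, h_n_shape, same_last, chunked_matches, fwd_ok, bwd_ok)
```

So for a single-layer, unidirectional LSTM, h_n really is just output[-1] with an extra leading dimension. With several layers or two directions, h_n carries states that output alone does not give you directly, and for a bidirectional LSTM output[-1] is not the final state of the backward direction.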