"hidden"和"hidden"之间有什么区别和“输出"在PyTorch LSTM中? [英] What's the difference between "hidden" and "output" in PyTorch LSTM?
Question
I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says:
Outputs: output, (h_n, c_n)
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len
- c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len
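The shapes above can be checked directly. A minimal sketch, with hypothetical sizes chosen purely for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration
seq_len, batch, input_size = 5, 3, 10
hidden_size, num_layers = 20, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers)  # unidirectional: num_directions = 1
x = torch.randn(seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

print(tuple(output.shape))  # (5, 3, 20): (seq_len, batch, hidden_size * num_directions)
print(tuple(h_n.shape))     # (2, 3, 20): (num_layers * num_directions, batch, hidden_size)
print(tuple(c_n.shape))     # (2, 3, 20): same layout as h_n
```

Note that `output` is indexed by time step, while `h_n` and `c_n` are indexed by layer (and direction).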
It seems that the variables output and h_n both give the values of the hidden state. Does h_n just redundantly provide the last time step that's already included in output, or is there something more to it than that?
Accepted answer
I made a diagram. The names follow the PyTorch docs, although I renamed num_layers to w.
output comprises all the hidden states in the last layer ("last" depth-wise, not time-wise). (h_n, c_n) comprises the hidden states after the last timestep, t = n, so you could potentially feed them into another LSTM.
The batch dimension is not included in the diagram.
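To make the redundancy question concrete, here is a quick check with a hypothetical 2-layer unidirectional LSTM: the last timestep of output coincides with the top layer's slice of h_n, but the lower layers' final hidden states appear only in h_n.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)  # unidirectional
x = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

# output[-1]: last timestep of the *top* layer
# h_n[-1]:    top layer's hidden state after the last timestep -> same values
print(torch.allclose(output[-1], h_n[-1]))  # True

# h_n[0], the first layer's final hidden state, is not contained in output at all.
```

So h_n is only partially redundant with output: for the top layer it duplicates the final timestep, but for every other layer (and for c_n entirely) it carries state that output does not expose.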