Doubts regarding `Understanding Keras LSTMs`


Question

I am new to LSTMs, and while going through Understanding Keras LSTMs I had some silly doubts related to a beautiful answer by Daniel Moller.

Here are some of my doubts:

1. There are 2 ways specified under the Achieving one to many section, where it's written that we can use stateful=True to recurrently take the output of one step and serve it as the input of the next step (needs output_features == input_features).

In the One to many with repeat vector diagram, the repeated vector is fed as input in all the time-steps, whereas in One to many with stateful=True the output is fed as input in the next time step. So, aren't we changing the way the layers work by using stateful=True?

Which of the above 2 approaches (using the repeat vector OR feeding the previous time-step output as the next input) should be followed when building an RNN?

One to many with stateful=True部分下,要改变one to many的行为,在手动循环预测的代码中,如何我们会知道 steps_to_predict 变量,因为我们事先不知道输出序列长度.

Under the One to many with stateful=True section, to change the behaviour of one to many, in the code for manual loop for prediction, how will we know the steps_to_predict variable because we don't know the ouput sequence length in advance.

I also did not understand the way the entire model is using the last_step output to generate the next_step output. It has confused me about the working of the model.predict() function. I mean, doesn't model.predict() predict the entire output sequence at once, rather than looping through the number of output steps to be generated (whose value I still don't know) and doing model.predict() to predict a specific time-step output in a given iteration?

3. I couldn't understand the Many to many case entirely. Any other link would be helpful.

4. I understand that we use model.reset_states() to make sure that a new batch is independent of the previous batch. But do we manually create batches of a sequence such that one batch follows another batch, or does Keras in stateful=True mode automatically divide the sequence into such batches?

If it's done manually, then why would anyone divide the dataset into such batches, in which a part of a sequence is in one batch and the rest is in the next batch?

5. At last, what are the practical implementations or examples/use-cases where stateful=True would be used (because this seems to be something unusual)? I am learning LSTMs, and this is the first time I've been introduced to stateful in Keras.

Can anyone help me in explaining my silly questions so that I can be clear on the LSTM implementation in Keras?

EDIT 1: Asking some of these to clarify the current answer, and some about the remaining doubts.

A. So, basically stateful lets us keep OR reset the inner state after every batch. Then, how would the model learn if we keep on resetting the inner state again and again after each batch is trained? Does resetting truly mean resetting the parameters (used in computing the hidden state)?

B. Regarding the line If stateful=False: automatically resets inner state, resets last output step: what did you mean by resetting the last output step? I mean, if every time-step produces its own output, then what does resetting the last output step mean, and why only the last one?

C. In response to Question 2 and the 2nd point of Question 4, I still didn't get your manipulate the batches between each iteration, or the need for stateful (which, per the last line of Question 2, only resets the states). I got to the point that we don't know the input for every output generated in a time-step.

So, you break the sequences into sequences of only one step and then use new_step = model.predict(last_step), but then how do you know how long you need to keep doing this again and again (there must be a stopping point for the loop)? Also, please explain the stateful part (in the last line of Question 2).

D. In the code under One to many with stateful=True, it seems that the for loop (manual loop) used for predicting the next word appears only at test time. Does the model incorporate that itself at train time, or do we manually need to use this loop at train time too?

E. Suppose we are doing some machine translation job. I think the breaking of sequences will occur after the entire input (the language to translate) has been fed into the input time-steps, and then the generation of outputs (the translated language) at each time-step will take place via the manual loop, because now we have run out of inputs and start producing the output at each time-step using the iteration. Did I get it right?

F. As the default working of LSTMs requires the 3 things mentioned in the answer, then in the case of breaking of sequences, are current_input and previous_output fed with the same vectors, because their values, when no current input is available, are the same?

G. Under many to many with stateful=True, in the Predicting: section, the code reads:

predicted = model.predict(totalSequences)  # one output step per input step, so predicted.shape == totalSequences.shape
firstNewStep = predicted[:,-1:]            # keep all sequences (the first :), take only the last step (-1:)

Since the manual loop for finding the very next word in the current sequence hasn't been used up to this point, how do I know the count of the time-steps predicted by model.predict(totalSequences), so that the last step from predicted (predicted[:,-1:]) can later be used for generating the rest of the sequences? I mean, how do I know the number of steps that were produced in predicted = model.predict(totalSequences) before the manual for loop is used later?

EDIT 2:

I. In answer D, I still didn't get how I will train my model. I understand that using the manual loop (during training) can be quite painful, but then, if I don't use it, how will the model get trained in circumstances where we want the 10 future steps and cannot output them at once because we don't have the necessary 10 input steps? Will simply using model.fit() solve my problem?

II. Answer D's last paragraph says: You could train step by step using train_on_batch only in the case you have the expected outputs of each step. But otherwise I think it's very complicated or impossible to train.

Can you explain this in more detail?

What does step by step mean? Whether or not I have the outputs for the later sequences, how will that affect my training? Do I still need the manual loop during training? If not, will the model.fit() function work as desired?

III. I interpreted the "repeat" option as using the repeat vector. Wouldn't using the repeat vector be good only for the one to many case, and not suitable for the many to many case, because the latter will have many input vectors to choose from (to be used as a single repeated vector)? How will you use the repeat vector for the many to many case?

Answer

Question 3

Understanding question 3 is sort of a key to understanding the others, so let's try it first.

All recurrent layers in Keras perform hidden loops. These loops are totally invisible to us, but we can see the results of each iteration at the end.

The number of invisible iterations is equal to the time_steps dimension. So, the recurrent calculations of an LSTM happen over the steps.

If we pass an input with X steps, there will be X invisible iterations.

Each iteration in an LSTM will take 3 inputs:

• The respective slice of the input data for this step
• The inner state of the layer
• The output of the last iteration

So, take the following example image, where our input has 5 steps:

What will Keras do in a single prediction?

• Step 0:
  • Take the first step of the inputs: input_data[:,0,:], a slice shaped as (batch, 2)
  • Take the inner state (which is zero at this point)
  • Take the last output step (which doesn't exist for the first step)
  • Pass through the calculations to:
    • Update the inner state
    • Create one output step (output 0)
• Step 1:
  • Take the next step of the inputs: input_data[:,1,:]
  • Take the updated inner state
  • Take the output generated in the last step (output 0)
  • Pass through the same calculations to:
    • Update the inner state again
    • Create one more output step (output 1)
• Step 2:
  • Take input_data[:,2,:]
  • Take the updated inner state
  • Take output 1
  • Pass through the calculations to:
    • Update the inner state
    • Create output 2

And so on until step 4.

Finally:

• If stateful=False: automatically resets inner state, resets last output step
• If stateful=True: keep inner state, keep last output step

You will not see any of these steps. It will look like just a single pass.
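
To make the hidden loop concrete, here is a minimal numpy sketch of the iteration described above. It is an illustration of the idea only, not Keras's actual implementation; all names, weights, and sizes are made up, and the 5-step input matches the example:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # One iteration: this step's input slice, the inner memory, the last output
    z = x_t @ W + h_prev @ U + b                       # all four gates at once
    i, f, o, g = np.split(z, 4, axis=-1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # update the inner state
    h = sigmoid(o) * np.tanh(c)                        # create one output step
    return h, c

batch, steps, features, units = 4, 5, 2, 3
rng = np.random.default_rng(0)
x = rng.normal(size=(batch, steps, features))          # (batch, 5, 2) as in the image
W = rng.normal(size=(features, 4 * units))
U = rng.normal(size=(units, 4 * units))
b = np.zeros(4 * units)

h = np.zeros((batch, units))  # last output: doesn't exist yet, so zeros
c = np.zeros((batch, units))  # inner state: zero at this point
outputs = []
for t in range(steps):        # the invisible loop: one iteration per time step
    h, c = lstm_step(x[:, t, :], h, c, W, U, b)
    outputs.append(h)

all_steps = np.stack(outputs, axis=1)  # (batch, steps, units)
last_step = outputs[-1]                # (batch, units)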

But you can choose:

• return_sequences = True: every output step is returned, shape (batch, steps, units)
  • This is exactly many to many. You get the same number of steps in the output as you had in the input
• return_sequences = False: only the last output step is returned, shape (batch, units)
  • This is many to one. You produce a single result for the entire input sequence (both shapes are checked in the sketch below)
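
A minimal sketch to check the two shapes (the layer size and data here are arbitrary, chosen only for illustration):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

x = np.random.random((8, 5, 2))  # (batch, steps, features)

many_to_many = Sequential([LSTM(3, return_sequences=True, input_shape=(5, 2))])
print(many_to_many.predict(x).shape)  # (8, 5, 3): one output step per input step

many_to_one = Sequential([LSTM(3, return_sequences=False, input_shape=(5, 2))])
print(many_to_one.predict(x).shape)   # (8, 3): only the last step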

Now, this answers the second part of your question 2: yes, predict will compute everything without you noticing. But:

The number of output steps will be equal to the number of input steps.

Question 4

Now, before going to question 2, let's look at question 4, which is actually the base of the answer.

Yes, the batch division should be done manually. Keras will not change your batches. So, why would I want to divide a sequence?

• 1, the sequence is too big, and one batch doesn't fit the computer's or the GPU's memory (see the sketch after this list)
• 2, you want to do what is happening in question 2: manipulate the batches between each step iteration.
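
For case 1, a minimal sketch of the manual division (all sizes and data here are made up for illustration): one long sequence of 1000 steps is fed as 10 consecutive chunks of 100 steps, and the states are only reset once the whole sequence has been seen:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# stateful=True requires a fixed batch_input_shape; states survive between batches
model = Sequential([
    LSTM(16, stateful=True, return_sequences=True, batch_input_shape=(1, 100, 2)),
    Dense(2),
])
model.compile(loss='mse', optimizer='adam')

long_x = np.random.random((1, 1000, 2))  # one sequence, too long for a single batch
long_y = np.random.random((1, 1000, 2))

for epoch in range(3):
    for i in range(10):                  # we divide the sequence manually
        sl = slice(i * 100, (i + 1) * 100)
        model.train_on_batch(long_x[:, sl], long_y[:, sl])  # states carry over
    model.reset_states()                 # the sequence ended: start fresh next epoch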

In question 2, we are "predicting the future". So, what is the number of output steps? Well, it's the number you want to predict. Suppose you're trying to predict the number of clients you will have based on the past. You can decide to predict for one month in the future, or for 10 months. Your choice.

Now, you're right to think that predict will calculate the entire thing at once, but remember question 3 above, where I said:

The number of output steps is equal to the number of input steps.

Also remember that the first output step is the result of the first input step, the second output step is the result of the second input step, and so on.

But we want the future, not something that matches the previous steps one by one. We want the result step to follow the "last" step.

So, we face a limitation: how do we define a fixed number of output steps if we don't have their respective inputs? (The inputs for the distant future are also in the future, so they don't exist.)

That's why we break our sequence into sequences of only one step. So predict will also output only one step.

When we do this, we have the ability to manipulate the batches between each iteration. And we have the ability to take output data (which we didn't have before) as input data.

And stateful is necessary because we want each of these steps to be connected as a single sequence (don't discard the states).

The best practical application of stateful=True that I know is the answer to question 2. We want to manipulate the data between steps.

This might be a dummy example, but another application is if you're, for instance, receiving data from a user on the internet. Each day the user uses your website, you give one more step of data to your model (and you want to continue this user's previous history in the same sequence).

Then, finally, question 1.

I'd say: always avoid stateful=True, unless you need it.
You don't need it to build a one to many network, so, better not use it.

Notice that the stateful=True example for this is the same as the predict the future example, but you start from a single step. It's hard to implement, and it will have worse speed because of the manual loops. But you can control the number of output steps, and this might be something you want in some cases.

There will be a difference in calculations too. And in this case I really can't answer whether one is better than the other. But I don't believe there will be a big difference. Networks are some kind of "art", though, and testing might bring funny surprises.

We should not mistake "states" for "weights". They're two different variables.

• Weights: the learnable parameters; they're never reset. (If you reset the weights, you lose everything the model has learned.)
• States: the current memory of a batch of sequences (relating to which step of the sequence I am on now, and what I have learned "from the specific sequences in this batch" up to this step). See the sketch after this list.
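
A quick sketch to verify the distinction (a made-up toy model; reset_states() touches only the states, never the weights):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential([LSTM(4, stateful=True, batch_input_shape=(2, 1, 3))])
model.predict(np.random.random((2, 1, 3)))  # builds up some states

w_before = [w.copy() for w in model.get_weights()]
model.reset_states()                         # clears the states only
w_after = model.get_weights()

print(all(np.array_equal(a, b) for a, b in zip(w_before, w_after)))  # True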

Imagine you are watching a movie (a sequence). Every second makes you build memories like the names of the characters, what they did, and what their relationships are.

Now imagine you get a movie you never saw before and start watching the last second of the movie. You will not understand the end of the movie because you need the previous story of this movie. (The states.)

Now imagine you finished watching an entire movie. Now you will start watching a new movie (a new sequence). You don't need to remember what happened in the last movie you saw. If you try to "join the movies", you will get confused.

In this example:

• Weights: your ability to understand and interpret movies, your ability to remember important names and actions
• States: for a paused movie, the states are the memory of what has happened from the beginning up to now

So, states are "not learned". States are "calculated", built step by step regarding each individual sequence in the batch. That's why:

• Resetting states means starting a new sequence from step 0 (starting a new movie)
• Keeping states means continuing the same sequence from the last step (continuing a paused movie, or watching part 2 of that story)

States are exactly what make recurrent networks work as if they had "memory from the past steps".

In an LSTM, the last output step is part of the "states".

An LSTM state contains:

• A memory matrix updated at every step by the calculations
• The output of the last step

So, yes: every step produces its own output, but every step uses the output of the last step as a state. This is how an LSTM is built.
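
You can see this in Keras by asking the layer to return its states (a minimal sketch; the sizes are arbitrary):

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM

inp = Input(shape=(5, 2))
out, state_h, state_c = LSTM(3, return_state=True)(inp)  # also return the states
model = Model(inp, [out, state_h, state_c])

o, h, c = model.predict(np.random.random((1, 5, 2)))
print(np.allclose(o, h))  # True: the last output step is the "h" part of the states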

• If you want to "continue" the same sequence, you want the memory of the last step's results
• If you want to "start" a new sequence, you don't want the memory of the last step's results (these results will stay stored if you don't reset the states)

You stop when you want. How many steps into the future do you want to predict? That's your stopping point.

Imagine I have a sequence with 20 steps. And I want to predict 10 steps in the future.

In a standard (non-stateful) network, we can use:

• Input 19 steps at once (from 0 to 18)
• Output 19 steps at once (from 1 to 19)

This is "predicting the next step" (notice the shift = 1 step). We can do this because we have all the input data available.

But when we want the 10 future steps, we cannot output them at once, because we don't have the necessary 10 input steps (those input steps lie in the future; we need the model to predict them first).

So we need to predict one future step from the existing data, then use this step as the input for the next future step.

But I want all of these steps to be connected. If I use stateful=False, the model will see a lot of "sequences of length 1". No, we want one sequence of length 30. A sketch of this loop follows.
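
Here it is, assuming `model` is an already trained stateful model with batch_input_shape=(1, 1, 1), trained as above to predict the next step (everything in this sketch is illustrative):

import numpy as np

known = np.random.random((1, 20, 1))           # the 20 known steps (toy data)

model.reset_states()                           # start a fresh sequence
for t in range(known.shape[1]):                # feed the known past one step at a
    new_step = model.predict(known[:, t:t+1])  # time, so the states build the "story"
# after feeding step 19, new_step is already the first future step

future = [new_step]
for _ in range(9):                             # our choice: 10 future steps in total
    new_step = model.predict(new_step)         # output of one step = input of the next
    future.append(new_step)

future = np.concatenate(future, axis=1)        # (1, 10, 1): one connected sequence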

This is a very good question and you got me ....

The stateful one to many was an idea I had when writing that answer, but I never used it. I prefer the "repeat" option, sketched below.
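
For reference (all sizes here are arbitrary): a single input vector is copied to every output time step by RepeatVector:

from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector

model = Sequential([
    RepeatVector(5, input_shape=(2,)),  # (batch, 2) -> (batch, 5, 2)
    LSTM(16, return_sequences=True),    # (batch, 5, 16)
    Dense(1),                           # (batch, 5, 1): 5 output steps from 1 input
])
model.compile(loss='mse', optimizer='adam')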

You could train step by step using train_on_batch only in the case you have the expected outputs of each step. But otherwise I think it's very complicated or impossible to train.

This is a common approach:

• Use a network to generate a condensed vector (this vector can be the result, or the generated states, or both)
• Use this condensed vector as the initial input/state of another network, generate the output manually step by step, and stop when an "end of sentence" word or character is produced by the model

There are also fixed-size models without the manual loop. You suppose your sentence has a maximum length of X words. Result sentences that are shorter than this are completed with "end of sentence" or "null" words/characters. A Masking layer is very useful in these models; a sketch follows.
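
A minimal sketch of such a fixed-size model (all sizes are made up; all-zero steps act as the padding that Masking skips):

import numpy as np
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense

max_len, features = 10, 8
model = Sequential([
    Masking(mask_value=0.0, input_shape=(max_len, features)),  # ignore padded steps
    LSTM(32, return_sequences=True),
    Dense(features, activation='softmax'),
])

batch = np.zeros((2, max_len, features))
batch[0, :4] = np.random.random((4, features))  # a 4-word sentence, rest is padding
batch[1, :7] = np.random.random((7, features))  # a 7-word sentence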

You provide only the input. The other two things (the last output and the inner states) are already stored in the stateful layer.

I made the input = last output only because our specific model is predicting the next step. That's what we want it to do. For each input, the next step.

We taught this with the shifted sequence in training.

It doesn't matter. We want only the last step.

• The number of sequences is kept by the first :.
• Only the last step is considered by -1:.

But if you want to know, you can print predicted.shape. It is equal to totalSequences.shape in this model.

First, we can't use "one to many" models to predict the future, because we don't have data for that. There is no possibility of understanding a "sequence" if you don't have the data for the steps of that sequence.

So, this type of model should be used for other types of applications. As I said before, I don't really have a good answer for this question. It's better to have a "goal" first; then we decide which kind of model is better for that goal.

"Step by step" means the manual loop.

If you don't have the outputs of the later steps, I think it's impossible to train. It's probably not a useful model at all. (But I'm not the one who knows everything.)

If you have the outputs, yes, you can train the entire sequences with fit without worrying about manual loops.

And you're right about III. You won't use a repeat vector in many to many because you have varying input data.

"One to many" and "many to many" are two different techniques, each with its own advantages and disadvantages. One will be good for certain applications, the other will be good for other applications.
