Understand the role of Flatten in Keras and determine when to use it


Question

I am trying to understand a model developed for time series forecasting. It uses a Conv1D layer and two LSTM layers, followed by a Dense layer. My question is: should it use Flatten() between the LSTM and the Dense layer? In my mind, the output should just have one value, with a shape of (None, 1), and that can be achieved by using Flatten() between the LSTM and Dense layers. Without the Flatten(), the output shape would be (None, 30, 1). Alternatively, I could remove return_sequences=True from the second LSTM layer, which I think has the same effect as the Flatten(). Which one is the more appropriate way? Do they affect the loss? Here is the model.

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32, kernel_size=3, strides=1, padding="causal", activation="relu", input_shape=(30, 1)),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.LSTM(32, return_sequences=True),
    # tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
])

Here is the model summary without Flatten():

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d (Conv1D)              (None, 30, 32)            128       
_________________________________________________________________
lstm (LSTM)                  (None, 30, 32)            8320      
_________________________________________________________________
lstm_1 (LSTM)                (None, 30, 32)            8320      
_________________________________________________________________
dense (Dense)                (None, 30, 1)             33        
=================================================================
Total params: 16,801
Trainable params: 16,801
Non-trainable params: 0
_________________________________________________________________

Answer

Well, it depends on what you want to achieve. I'll try to give you some hints, because it is not 100% clear to me what you want to obtain.

If your LSTM uses return_sequences=True, then you are returning the output of each LSTM cell, i.e., an output for each timestep. If you then add a Dense layer, it will be applied independently on top of each timestep's output.
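As a minimal sketch of this case (assuming the same layer sizes as in your model), a Dense(1) on top of a sequence output keeps the temporal dimension:

# Sketch: Dense applied to a sequence output.
# The Dense(1) is applied to each of the 30 timesteps
# independently, so the output stays three-dimensional.
import tensorflow as tf

seq_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, return_sequences=True, input_shape=(30, 1)),
    tf.keras.layers.Dense(1),
])
print(seq_model.output_shape)  # (None, 30, 1)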

If you use a Flatten layer together with return_sequences=True, then you are basically collapsing the temporal dimension, obtaining something like (None, 30 * 32) = (None, 960) in your case. Then you can add a Dense layer, or whatever else you need.
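A sketch of that variant (again assuming your layer sizes), where Flatten collapses (None, 30, 32) into (None, 960) so a following Dense(1) produces a single value per sample:

# Sketch: Flatten between the sequence output and the Dense layer.
import tensorflow as tf

flat_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, return_sequences=True, input_shape=(30, 1)),
    tf.keras.layers.Flatten(),  # (None, 30, 32) -> (None, 960)
    tf.keras.layers.Dense(1),   # (None, 960)    -> (None, 1)
])
print(flat_model.output_shape)  # (None, 1)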

If you set return_sequences=False, you just get the output at the very end of your LSTM (note that in any case, due to how the LSTM works, it is based on the computation carried out at the previous timesteps), and the output will have shape (None, dim), where dim equals the number of hidden units you are using in your LSTM (i.e., 32). Here, again, you can simply add a Dense layer with one unit to get the (None, 1) output you are looking for.
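A minimal sketch of this last variant, dropping return_sequences from the final LSTM so only the last timestep's output is returned:

# Sketch: return_sequences=False (the default), so the LSTM
# emits only its final state: (None, 32) -> Dense(1) -> (None, 1).
import tensorflow as tf

last_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(30, 1)),
    tf.keras.layers.Dense(1),
])
print(last_model.output_shape)  # (None, 1)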
