How to reshape text data to be suitable for an LSTM model in Keras
Question
Update 1:
The code I'm referring to is exactly the code in the book, which you can find here.
The only thing is that I don't want to have embed_size in the decoder part. That's why I think I don't need an embedding layer at all, because if I put an embedding layer, I need to have embed_size in the decoder part (please correct me if I'm wrong).
Overall, I'm trying to adapt the same code without the embedding layer, because I need to have vocab_size in the decoder part.
I think the suggestion provided in the comments could be correct (using one_hot_encoding); however, I faced this error:
When I do the one-hot encoding like this:
tf.keras.backend.one_hot(indices=sent_wids, classes=vocab_size)
I get this error:
in check_num_samples: you should specify the + steps_name + argument
ValueError: If your data is in the form of symbolic tensors, you should specify the steps_per_epoch argument (instead of the batch_size argument, because symbolic tensors are expected to produce batches of input data)
The way I have prepared the data is like this:
The shape of sent_lens is (87716, 200), and I want to reshape it in a way that I can feed it into the LSTM. Here 200 stands for the sequence length and 87716 is the number of samples I have.
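For concreteness, here is a minimal sketch of what the one-hot reshaping of such an array looks like. The small sizes below are stand-ins for the real (87716, 200) data, and the random integer ids are placeholders for the actual word ids:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Stand-in sizes; the real data has 87716 samples and 200 timesteps.
NUM_SAMPLES, SEQUENCE_LEN, VOCAB_SIZE = 4, 200, 50

# Integer word ids, shape (num_samples, sequence_len).
sent_wids = np.random.randint(0, VOCAB_SIZE, size=(NUM_SAMPLES, SEQUENCE_LEN))

# One-hot encode: (num_samples, sequence_len) -> (num_samples, sequence_len, vocab_size),
# which is the 3-D (samples, timesteps, features) shape an LSTM layer expects.
Xtrain = to_categorical(sent_wids, num_classes=VOCAB_SIZE)
print(Xtrain.shape)  # (4, 200, 50)
```

Unlike tf.keras.backend.one_hot, to_categorical returns a plain NumPy array rather than a symbolic tensor, so it can be passed to fit directly.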
Here is the code for the LSTM Autoencoder:
from tensorflow.keras.layers import Input, LSTM, Bidirectional, RepeatVector
from tensorflow.keras.models import Model

inputs = Input(shape=(SEQUENCE_LEN, VOCAB_SIZE), name="input")
encoded = Bidirectional(LSTM(LATENT_SIZE), merge_mode="sum", name="encoder_lstm")(inputs)
decoded = RepeatVector(SEQUENCE_LEN, name="repeater")(encoded)
decoded = LSTM(VOCAB_SIZE, return_sequences=True)(decoded)
autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer="sgd", loss="mse")
autoencoder.summary()
history = autoencoder.fit(Xtrain, Xtrain, batch_size=BATCH_SIZE,
                          epochs=NUM_EPOCHS)
Do I still need to do anything extra, and if not, why can't I get this to work?
Please let me know which part is not clear and I will explain.
Thanks for your help :)
Answer
So, as said in the comments, it turns out I just needed to do one_hot_encoding.
When I did the one-hot encoding using tf.keras.backend, it threw the error that I have updated in my question.
Then I tried to_categorical(sent_wids, num_classes=VOCAB_SIZE) and it fixed it (however, I then faced a memory error :D, which is a different story)!
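The memory error is unsurprising: materializing the full one-hot array means holding 87716 × 200 × VOCAB_SIZE floats at once. A common workaround (a sketch of my own, not part of the original answer; the name one_hot_batches is made up) is to one-hot encode each batch lazily inside a generator:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

def one_hot_batches(sent_wids, vocab_size, batch_size):
    """Yield (x, x) autoencoder batches, one-hot encoding each batch
    on the fly so the full 3-D array never has to fit in memory."""
    n = len(sent_wids)
    while True:  # Keras expects the generator to loop indefinitely
        for start in range(0, n, batch_size):
            batch = to_categorical(sent_wids[start:start + batch_size],
                                   num_classes=vocab_size)
            yield batch, batch

# Usage sketch:
# steps = int(np.ceil(len(sent_wids) / BATCH_SIZE))
# autoencoder.fit(one_hot_batches(sent_wids, VOCAB_SIZE, BATCH_SIZE),
#                 steps_per_epoch=steps, epochs=NUM_EPOCHS)
```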
I should also mention that I tried sparse_categorical_crossentropy instead of one_hot_encoding, though it did not work!
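For reference, sparse_categorical_crossentropy only applies when the model outputs a softmax distribution over the vocabulary at each timestep while the targets stay as integer ids; with the mse setup above it cannot work. A hedged sketch of how that wiring would typically look (my own guess at a setup, not the book's code; sizes are stand-ins):

```python
from tensorflow.keras.layers import (Input, LSTM, Bidirectional,
                                     RepeatVector, Dense, TimeDistributed)
from tensorflow.keras.models import Model

SEQUENCE_LEN, VOCAB_SIZE, LATENT_SIZE = 200, 50, 32  # stand-in sizes

# Inputs can stay one-hot encoded; targets stay as integer word ids.
inputs = Input(shape=(SEQUENCE_LEN, VOCAB_SIZE), name="input")
encoded = Bidirectional(LSTM(LATENT_SIZE), merge_mode="sum")(inputs)
decoded = RepeatVector(SEQUENCE_LEN)(encoded)
decoded = LSTM(LATENT_SIZE, return_sequences=True)(decoded)
# Per-timestep softmax over the vocabulary.
decoded = TimeDistributed(Dense(VOCAB_SIZE, activation="softmax"))(decoded)

autoencoder = Model(inputs, decoded)
# With this loss, fit(X_onehot, sent_wids) takes integer targets directly,
# so the targets never need one-hot encoding at all.
autoencoder.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")
```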
Thanks for your help :)