Use string as input in Keras IMDB example

Views: 79 | Published: 2020/4/25 10:44:31 | Tags: tensorflow machine-learning nlp keras tensorflow-serving

Question

I was looking at the Keras IMDB movie review sentiment classification example (and the corresponding model on GitHub), which learns to decide whether a review is positive or negative.

The data has been preprocessed so that each review is encoded as a sequence of integers. For example, the review "This movie is awesome!" would be [11, 17, 6, 1187], and for this input the model gives the output 'positive'.

The dataset also makes available the word index used to encode the sequences, i.e. I know the mapping

This: 11
movie: 17
is: 6
awesome: 1187
...

Can I somehow include this knowledge in the model so that its input is a string, i.e. so that it gives a prediction based on the input "This movie is awesome!"?

Recommended answer

First of all, the input to the neural network itself is never a string; it is a list of indices of words (or characters) in a vocabulary. The first thing the model usually does is an embedding transformation (see the example), which converts these indices into (trainable) float vectors.
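To make the embedding step concrete, here is a minimal pure-Python sketch of what an embedding lookup does with a list of word indices. The table values are random placeholders rather than trained weights, and the names `make_embedding_table` and `embed`, the vocabulary size, and the vector dimension are all hypothetical choices for illustration. In a real Keras model this role is played by an embedding layer whose table is learned during training.

```python
import random


def make_embedding_table(vocab_size, dim, seed=0):
    """Build a toy embedding table: one float vector per vocabulary index.

    Assumption: values are random placeholders; a trained model would
    hold learned weights here instead.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size)]


def embed(indices, table):
    """Replace each word index with the row of the table at that index."""
    return [table[i] for i in indices]


# The encoded review from the question: "This movie is awesome!"
table = make_embedding_table(vocab_size=2000, dim=4)
vectors = embed([11, 17, 6, 1187], table)

print(len(vectors), "vectors of dimension", len(vectors[0]))
```

The point is only that the network never sees the words themselves: each index selects a row of floats, and those rows are what the later layers operate on.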

What you really mean is a data pre-processing step that transforms the raw input from the user (text, image pixels, a sound recording, etc.) into a format that is suitable and convenient for the model. Data pre-processing is as essential a part of a machine-learning application as the model itself, and should be stored separately. If you intend to work with the imdb dataset, the vocabulary is already pre-processed. You can call imdb.get_word_index() in keras to get the word index, or you can work with the vocabulary json file directly.
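A pre-processing step of this kind could look roughly like the sketch below. It uses a tiny hard-coded word index copied from the question so that it runs without downloading anything; in practice the dictionary would come from imdb.get_word_index() or the vocabulary json file. The function name `encode_review`, the tokenizing regular expression, and the `oov_index` and `index_offset` parameters are assumptions made for illustration, not part of the original answer.

```python
import re

# Toy word index taken from the question. Assumption: a real application
# would load the full mapping from imdb.get_word_index() instead.
WORD_INDEX = {"this": 11, "movie": 17, "is": 6, "awesome": 1187}


def encode_review(text, word_index, oov_index=2, index_offset=0):
    """Turn a raw review string into a list of word indices.

    Words missing from the vocabulary are mapped to oov_index.
    index_offset is added to every known word's index (see note below).
    """
    # Lowercase and keep only runs of letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [word_index[t] + index_offset if t in word_index else oov_index
            for t in tokens]


encoded = encode_review("This movie is awesome!", WORD_INDEX)
print(encoded)  # [11, 17, 6, 1187]
```

One caveat worth checking against the model you actually use: to my knowledge imdb.load_data() by default shifts every index by 3 (`index_from=3`) and reserves 1 as a start marker and 2 for out-of-vocabulary words, so a model trained on that data would likely need `index_offset=3` and a leading 1. The resulting list would then be padded to the length the model expects before being passed to it for prediction.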
