How to predict sentiment for new text using the Keras IMDB dataset?


Question

I'm using Keras to implement a sentiment analysis model. I've created the model and trained it, but now I'm not sure how to predict on new data, since the IMDB dataset is already encoded as integer vectors ([22, 33, 4, etc.]).

So how do I perform a prediction on a new sentence like "i love this movie"?

from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense, Convolution1D, Flatten, Dropout, Embedding
from keras.preprocessing import sequence
from keras.callbacks import TensorBoard

# Using keras to load the dataset with the top_words
top_words = 10000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)

# Pad the sequence to the same length
max_review_length = 1600
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)

# Using embedding from Keras
embedding_vector_length = 300
model = Sequential()
model.add(Embedding(top_words, embedding_vector_length, input_length=max_review_length))

# Convolutional model (3x conv, flatten, 2x dense)
model.add(Convolution1D(64, 3, padding='same'))
model.add(Convolution1D(32, 3, padding='same'))
model.add(Convolution1D(16, 3, padding='same'))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(180,activation='sigmoid'))
model.add(Dropout(0.2))
model.add(Dense(1,activation='sigmoid'))

# Log to tensorboard
tensorBoardCallback = TensorBoard(log_dir='./logs', write_graph=True)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train, epochs=3, callbacks=[tensorBoardCallback], batch_size=64)

# Evaluation on the test set
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

model.save("trained_demo.h5")

Recommended answer

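You have to get the dictionary of (word, index) pairs. With it you can convert each word of the new sentence to its index and then pad the resulting sequence to the length the model was trained on. Keep in mind that imdb.load_data() shifts every index by 3 by default (index_from=3) and uses 2 for unknown words, so the raw dictionary values need the same treatment.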

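from nltk import word_tokenize  # needs the NLTK "punkt" tokenizer data
from keras.datasets import imdb
from keras.preprocessing import sequence

# top_words, max_review_length and model come from the training script above
word2index = imdb.get_word_index()
test = []
for word in word_tokenize("i love this movie".lower()):
    # Apply the same +3 offset as load_data(); unknown or rare words map to 2
    index = word2index.get(word, -1) + 3
    test.append(index if 3 <= index < top_words else 2)

test = sequence.pad_sequences([test], maxlen=max_review_length)
model.predict(test)  # sigmoid output: closer to 1 means positive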
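
The encoding step can also be checked without Keras or the trained model. The following is only a sketch: `encode_review`, `decode_review`, the regex tokenizer, and the toy vocabulary are hypothetical names and values made up for illustration. It assumes the default `imdb.load_data()` conventions (0 = padding, 1 = start of review, 2 = unknown word, real words shifted by 3) and mirrors the default left padding and left truncation of `pad_sequences`.

```python
import re

# Conventions used by keras.datasets.imdb.load_data() by default.
PAD, START, OOV, INDEX_FROM = 0, 1, 2, 3


def encode_review(text, word_index, num_words, maxlen):
    """Encode raw text the way the IMDB training data is encoded."""
    # Simple lowercase tokenizer (an assumption; the dataset's own
    # tokenization may differ in details such as punctuation handling).
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    ids = [START]
    for token in tokens:
        idx = word_index.get(token)
        # Shift by INDEX_FROM; unknown or too-rare words become OOV.
        if idx is None or idx + INDEX_FROM >= num_words:
            ids.append(OOV)
        else:
            ids.append(idx + INDEX_FROM)
    # Mirror pad_sequences defaults: keep the last maxlen ids, pad on the left.
    ids = ids[-maxlen:]
    return [PAD] * (maxlen - len(ids)) + ids


def decode_review(ids, word_index):
    """Map ids back to words to sanity-check an encoding."""
    index_to_word = {i + INDEX_FROM: w for w, i in word_index.items()}
    special = {PAD: "<pad>", START: "<start>", OOV: "<unk>"}
    return [special.get(i) or index_to_word.get(i, "<unk>") for i in ids]


# Toy vocabulary standing in for imdb.get_word_index() (hypothetical values).
toy_index = {"the": 1, "i": 2, "movie": 3, "this": 4, "love": 5, "zyzzyva": 9000}

encoded = encode_review("I love this movie", toy_index, num_words=100, maxlen=8)
print(encoded)
print(decode_review(encoded, toy_index))
```

With the real data, the same helper would presumably be called as `encode_review(text, imdb.get_word_index(), top_words, max_review_length)`, and the result wrapped in a NumPy array of shape (1, max_review_length) before being passed to `model.predict`. Decoding the encoded ids back to words is a quick way to spot an offset mistake before trusting a prediction.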
