Keras Text Preprocessing - Saving Tokenizer object to file for scoring
Problem description
I've trained a sentiment classifier model with the Keras library by (broadly) following the steps below.
- Convert the text corpus into sequences using the Tokenizer object/class
- Build a model using the model.fit() method
- Evaluate this model
Now, to score with this model, I can save the model to a file and load it back from that file. However, I have not found a way to save the Tokenizer object to a file. Without it, I have to process the corpus again every time I need to score even a single sentence. Is there a way around this?
Recommended answer
The most common way is to use either pickle or joblib. Here is an example of how to use pickle to save the Tokenizer:
import pickle

# saving
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)

# loading
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)
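If the save and load steps live in separate training and scoring scripts, it may help to wrap them in small helpers. The following is only a sketch: the function names save_tokenizer and load_tokenizer are hypothetical, and a plain dict stands in for the fitted Tokenizer so that the example runs without Keras installed.

```python
import os
import pickle
import tempfile


def save_tokenizer(tokenizer, path):
    """Pickle a fitted tokenizer (or any picklable object) to `path`."""
    with open(path, "wb") as handle:
        pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_tokenizer(path):
    """Load a previously pickled tokenizer from `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Assumption: this dict is only a stand-in for a fitted Tokenizer,
# used so the round trip can be checked without Keras.
stand_in = {"word_index": {"good": 1, "bad": 2}, "num_words": 1000}

# Write to a temporary directory so the sketch leaves no file behind in the project.
path = os.path.join(tempfile.mkdtemp(), "tokenizer.pickle")
save_tokenizer(stand_in, path)
restored = load_tokenizer(path)
```

In a scoring script, the object returned by load_tokenizer would be the original Tokenizer, so it can convert a single sentence to a sequence without reprocessing the corpus. Only unpickle files you trust, because pickle.load can execute arbitrary code contained in the file.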