Keras Text Preprocessing - Saving Tokenizer object to file for scoring

Question

I've trained a sentiment classifier model using the Keras library by (broadly) following the steps below:

  1. Convert the text corpus into sequences using a Tokenizer object/class
  2. Build a model using the model.fit() method
  3. Evaluate this model

Now, for scoring with this model, I was able to save the model to a file and load it back from that file. However, I've not found a way to save the Tokenizer object to a file. Without this, I'll have to process the corpus every time I need to score even a single sentence. Is there a way around this?

Recommended answer

The most common way is to use either pickle or joblib. Here is an example of using pickle to save and reload a Tokenizer:

import pickle

# saving
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)

# loading
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)
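
The same save/load round trip can be wrapped in two small helpers. This is only a minimal sketch: the helper names (save_object, load_object, build_word_index, to_sequence) are made up for illustration, and a plain dict word index stands in for the fitted Tokenizer so that the example runs without Keras installed. The helpers accept any picklable object, so a real fitted Tokenizer would be passed in the same way.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Serialize any picklable object (e.g. a fitted Tokenizer) to disk."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Load a previously pickled object. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def build_word_index(texts):
    """Hypothetical stand-in for fitting a tokenizer: word -> integer id."""
    word_index = {}
    for text in texts:
        for word in text.lower().split():
            # New words get the next free id, starting at 1.
            word_index.setdefault(word, len(word_index) + 1)
    return word_index


def to_sequence(word_index, text):
    """Map a sentence to ids, skipping words unseen during fitting."""
    return [word_index[w] for w in text.lower().split() if w in word_index]


# Training time: process the corpus once, then persist the result.
corpus = ["the movie was great", "the plot was dull"]
word_index = build_word_index(corpus)
path = Path(tempfile.mkdtemp()) / "tokenizer.pickle"
save_object(word_index, path)

# Scoring time: load from disk instead of re-processing the corpus.
restored = load_object(path)
sequence = to_sequence(restored, "the plot was great")
```

Note that a pickle file should only be loaded from a trusted source, since unpickling can execute arbitrary code. When pickling a real library object, it is safest to load it with the same library version that was used to save it.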
