Loading Gensim FastText Model with Callbacks Fails


Question

After creating a FastText model with Gensim, I want to load it, but I am running into errors that seem to be related to callbacks.

The code used to create the model is

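import gensim

TRAIN_EPOCHS = 30
WINDOW = 5
MIN_COUNT = 50
DIMS = 256

# model_input is an iterable of tokenized sentences, defined elsewhere.
# size= and iter= are the gensim 3.x parameter names
# (gensim 4.0 renamed them to vector_size= and epochs=).
vocab_model = gensim.models.FastText(sentences=model_input,
                                     size=DIMS,
                                     window=WINDOW,
                                     iter=TRAIN_EPOCHS,
                                     workers=6,
                                     min_count=MIN_COUNT,
                                     callbacks=[EpochSaver("./ftchkpts/")])

vocab_model.save('ft_256_min_50_model_30eps')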

and the callback EpochSaver is defined as

import os

from gensim.models.callbacks import CallbackAny2Vec

class EpochSaver(CallbackAny2Vec):
    '''Callback to save model after each epoch and show training parameters '''

    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        os.makedirs(self.savedir, exist_ok=True)

    def on_epoch_end(self, model):
        savepath = os.path.join(self.savedir, f"ft256_{self.epoch}e")
        model.save(savepath)
        print(f"Epoch saved: {self.epoch + 1}")
        if os.path.isfile(os.path.join(self.savedir, f"ft256_{self.epoch-1}e")):
            os.remove(os.path.join(self.savedir,  f"ft256_{self.epoch-1}e"))
            print("Previous model deleted ")
        self.epoch += 1

Aside from the type of model, this is identical to my process for Word2Vec, which worked without issue. However, when I open another file and try to load the model with

from gensim.models import FastText
vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')

I get the error

AttributeError: Can't get attribute 'EpochSaver' on <module '__main__'>

What can I do to get the vocabulary to load so I can create the embedding layer for my Keras model? If it's relevant, this is happening in JupyterLab.

Recommended answer

This extra difficulty loading models with custom callbacks is a known, open issue (at least through gensim-3.8.1 and October 2019).

The discussion on that issue covers possible workarounds and fixes. The gensim team is also considering simply disabling the auto-saving of callbacks altogether, which would require re-specifying them for each later train()/etc. call that needs them.

You may be able to load existing models saved with your custom callbacks by importing those same callback classes, under the same names, into the code context where you're doing the load().
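The error message points to why this workaround can help: gensim's save() and load() are built on Python's pickle, which records a custom class only by its module and name, so that name has to be resolvable again at load time. The following is a minimal stdlib-only sketch of that mechanism, not gensim code. The `training_session` module, the plain-dict "model", and the simplified `EpochSaver` are all hypothetical stand-ins chosen for illustration.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the session where training happened
# (in the question, that role is played by the notebook's __main__).
training_ns = types.ModuleType("training_session")
sys.modules["training_session"] = training_ns


class EpochSaver:
    """Simplified stand-in for the custom callback (not the gensim class)."""

    def __init__(self, savedir):
        self.savedir = savedir


# Make the class look as if it had been defined in the training session.
EpochSaver.__module__ = "training_session"
EpochSaver.__qualname__ = "EpochSaver"
training_ns.EpochSaver = EpochSaver

# Stand-in for a saved model that still carries its callbacks.
payload = pickle.dumps({"callbacks": (EpochSaver("./ftchkpts/"),)})

# Simulate a fresh session in which the class has not been defined.
del training_ns.EpochSaver
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc  # same kind of error as in the question

# Make the class available again under the same name, then load.
training_ns.EpochSaver = EpochSaver
restored = pickle.loads(payload)
print(type(load_error).__name__, "->", restored["callbacks"][0].savedir)
```

For the notebook in the question, the analogous step would be to define or import EpochSaver, under that exact name, in the loading notebook before calling FastText.load().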

Alternatively, you could save callback-free versions of your trained models by resetting the model's callbacks property to its empty default value just before you save(), e.g.:

model.callbacks = ()
model.save(save_path)

Then you wouldn't need to do any special importing of custom classes before a load(). (Of course, if you needed callback functionality on the re-loaded model again, the callbacks would have to be explicitly re-established after load().)
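If the callbacks should stay attached to the in-memory model after saving, the blanking step could be wrapped in a small context manager. This is only a sketch under stated assumptions: `callbacks_blanked` is a hypothetical helper, not part of gensim, and the example uses a `SimpleNamespace` and plain pickle as stand-ins for the model and its save()/load().

```python
import pickle
from contextlib import contextmanager
from types import SimpleNamespace


class EpochSaver:
    """Simplified stand-in for the custom callback (hypothetical)."""

    def __init__(self, savedir):
        self.savedir = savedir


@contextmanager
def callbacks_blanked(model):
    """Temporarily set model.callbacks to the empty default, then restore it."""
    original = model.callbacks
    model.callbacks = ()
    try:
        yield model
    finally:
        model.callbacks = original


# Stand-in for a trained model; with gensim this would be the FastText object.
model = SimpleNamespace(vectors=[0.1, 0.2],
                        callbacks=(EpochSaver("./ftchkpts/"),))

with callbacks_blanked(model) as m:
    payload = pickle.dumps(m)  # with gensim: m.save(save_path)

# The saved bytes no longer reference the callback class, so loading
# does not depend on EpochSaver being importable.
restored = pickle.loads(payload)
print(restored.callbacks, len(model.callbacks))
```

With a real model, the body of the `with` block would be the model.save(save_path) call, and any callbacks needed later would be supplied again after load().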
