Loading Gensim FastText Model with Callbacks Fails
Question
After creating a FastText model using Gensim, I want to load it, but I am running into errors that seem to be related to callbacks.
The code used to create the model is:
import gensim

TRAIN_EPOCHS = 30
WINDOW = 5
MIN_COUNT = 50
DIMS = 256

# gensim 3.x parameter names (gensim 4 renamed size/iter to vector_size/epochs)
vocab_model = gensim.models.FastText(sentences=model_input,
                                     size=DIMS,
                                     window=WINDOW,
                                     iter=TRAIN_EPOCHS,
                                     workers=6,
                                     min_count=MIN_COUNT,
                                     callbacks=[EpochSaver("./ftchkpts/")])
vocab_model.save('ft_256_min_50_model_30eps')
and the callback EpochSaver is defined as:
import os

from gensim.models.callbacks import CallbackAny2Vec


class EpochSaver(CallbackAny2Vec):
    '''Callback to save model after each epoch and show training parameters '''

    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        os.makedirs(self.savedir, exist_ok=True)

    def on_epoch_end(self, model):
        # Save a checkpoint for the epoch that just finished
        savepath = os.path.join(self.savedir, f"ft256_{self.epoch}e")
        model.save(savepath)
        print(f"Epoch saved: {self.epoch + 1}")
        # Keep only the latest checkpoint on disk
        if os.path.isfile(os.path.join(self.savedir, f"ft256_{self.epoch-1}e")):
            os.remove(os.path.join(self.savedir, f"ft256_{self.epoch-1}e"))
            print("Previous model deleted ")
        self.epoch += 1
Aside from the type of model, this is identical to my process for Word2Vec, which worked without issue. However, when I open another file and try to load the model with
from gensim.models import FastText
vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')
I get the error:
AttributeError: Can't get attribute 'EpochSaver' on <module '__main__'>
What can I do to get the vocabulary to load so I can create the embedding layer for my Keras model? If it's relevant, this is happening in JupyterLab.
Recommended answer
This extra difficulty loading models with custom callbacks is a known, open issue (at least through gensim-3.8.1 and October 2019).
You can see discussions of possible workarounds and fixes in that issue. The gensim team is considering simply disabling the auto-saving of callbacks altogether, which would require them to be re-specified for each later train() (or similar) call that needs them.
You may be able to load existing models saved with your custom callbacks by importing those same callback classes, under the same names, into the code context where you call load().
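The reason importing the class helps can be illustrated without gensim at all. gensim's save() and load() rely on Python's pickle, and pickle stores an instance's class only as a module-and-name reference, so that name must be resolvable at load time. The following is a minimal sketch of that mechanism, not gensim's own code: the module name fake_main and the stripped-down EpochSaver are hypothetical stand-ins for the notebook's __main__ and the real callback.

```python
import pickle
import sys
import types

# Minimal stand-in for the real callback class (hypothetical, no gensim needed).
CALLBACK_SOURCE = '''
class EpochSaver:
    def __init__(self, savedir):
        self.savedir = savedir
'''


def define_callback(module):
    """Run the class definition inside `module`, like executing a notebook cell."""
    exec(CALLBACK_SOURCE, module.__dict__)


# Hypothetical module standing in for the training notebook's __main__.
training_ns = types.ModuleType("fake_main")
sys.modules["fake_main"] = training_ns
define_callback(training_ns)

# The pickle records only the reference "fake_main.EpochSaver" plus attributes.
payload = pickle.dumps(training_ns.EpochSaver("./ftchkpts/"))

# Simulate a fresh session in which the class has not been defined yet.
del training_ns.EpochSaver
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = str(exc)

# Re-defining the class under the same name lets the load succeed.
define_callback(training_ns)
restored = pickle.loads(payload)

print(load_error)
print(restored.savedir)
```

In the question's setting, the equivalent step would be to run the EpochSaver class definition (or import it from a shared module) in the loading notebook before calling FastText.load().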
You could save callback-free versions of your trained models by resetting the model's callbacks property to its empty default value just before you save(), e.g.:
model.callbacks = ()
model.save(save_path)
Then you wouldn't need to do any special importing of custom classes before a load(). (Of course, if you again needed callback functionality on the re-loaded model, the callbacks would have to be explicitly re-established after load().)
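If the in-memory model should keep its callbacks after saving, the blank-then-save step can be wrapped in a small helper. This is only a sketch: save_without_callbacks is a hypothetical helper, not part of gensim, and DummyModel is a stand-in that mimics just the callbacks attribute and save() method so the example runs without gensim installed.

```python
import pickle
import tempfile
from pathlib import Path


def save_without_callbacks(model, path):
    """Save `model` with an empty `callbacks` attribute, then restore the originals."""
    saved_callbacks = getattr(model, "callbacks", ())
    model.callbacks = ()
    try:
        model.save(path)
    finally:
        # Put the callbacks back so the in-memory model can keep using them.
        model.callbacks = saved_callbacks


class DummyModel:
    """Hypothetical stand-in for a gensim model; mimics only save() and callbacks."""

    def __init__(self, callbacks=()):
        self.callbacks = callbacks

    def save(self, path):
        with open(path, "wb") as fh:
            pickle.dump({"callbacks": self.callbacks}, fh)


demo = DummyModel(callbacks=("placeholder-callback",))
demo_path = Path(tempfile.mkdtemp()) / "demo_model"
save_without_callbacks(demo, str(demo_path))

# Inspect what was actually written to disk.
with open(demo_path, "rb") as fh:
    on_disk = pickle.load(fh)
```

With a real gensim model, the same helper would be called in place of model.save(save_path), assuming the model exposes callbacks and save() as shown in the snippets above.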