How to break the Word2vec training from a callback function?

Question

I am training a skip-gram model using gensim Word2Vec. To avoid overfitting, I would like to stop training before it reaches the number of epochs passed in the parameters, based on an accuracy test on a separate dataset.

Is there a way in gensim to interrupt Word2Vec training from a callback function?

Recommended answer

If more training does in fact make your Word2Vec model worse on some external evaluation, there is likely something else wrong with your setup. (For example, many online code examples that call train() multiple times in a loop mismanage the learning rate alpha so that it actually goes negative, which would mean each training example results in anti-corrections to the model via backpropagation.)
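To see how the learning rate can end up negative, here is a small arithmetic sketch of one commonly copied pattern: call train() once per pass and subtract a fixed step from alpha afterwards. The starting value 0.025 and the step 0.002 are assumptions chosen for illustration, and `alpha_schedule` is a hypothetical helper, not part of gensim.

```python
def alpha_schedule(start_alpha=0.025, step=0.002, passes=20):
    """Return the alpha in effect for each pass of a manual-decay loop."""
    alphas = []
    alpha = start_alpha
    for _ in range(passes):
        alphas.append(alpha)  # value the training call would use on this pass
        alpha -= step         # manual decrement applied after each pass
    return alphas


alphas = alpha_schedule()
# Index (0-based) of the first pass that would run with a negative alpha.
first_negative = next(i for i, a in enumerate(alphas) if a < 0)
print("first pass with negative alpha:", first_negative + 1)
print("alpha on that pass:", alphas[first_negative])
```

Under these assumed numbers the decrement overshoots zero well before 20 passes, so every later pass pushes the vectors in the wrong direction. A single train() call with the desired epochs value lets the library handle the decay itself.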

If instead the main problem is truly overfitting, a better solution than conditional early-stopping would probably be adjusting other parameters, such as the model size, so that it can't overshoot useful generalization no matter how many training passes are made.

But if you really want to try the less-good approach of early stopping, you could potentially raise a catchable exception in your callback, and catch it outside train() to allow your other code to continue with the results of the aborted training. For example...

Define a custom exception...

class OverfitException(Exception):
    pass

...then in your callback...

    raise OverfitException()

...and around training...

try:
    model.train(...)
except OverfitException:
    print("training cut short")
# ... & your code with partially-trained model continues
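The three fragments above can be combined into one runnable sketch. gensim's `CallbackAny2Vec` base class provides an `on_epoch_end(model)` hook; everything else here is an assumption for illustration: `EarlyStopper`, its `evaluate` and `patience` parameters, the fake score list, and `toy_train`, which is only a stand-in for `model.train()` so the control flow can be run without gensim or a corpus.

```python
try:
    # With gensim installed, subclass its callback base class.
    from gensim.models.callbacks import CallbackAny2Vec
except ImportError:
    # Fallback so this sketch still runs without gensim.
    CallbackAny2Vec = object


class OverfitException(Exception):
    """Raised from the callback to abort training early."""


class EarlyStopper(CallbackAny2Vec):
    """Hypothetical helper: abort when a held-out score stops improving."""

    def __init__(self, evaluate, patience=1):
        self.evaluate = evaluate  # callable: model -> score, higher is better
        self.patience = patience  # non-improving epochs tolerated before aborting
        self.best = float("-inf")
        self.bad_epochs = 0
        self.epochs_done = 0

    def on_epoch_end(self, model):
        self.epochs_done += 1
        score = self.evaluate(model)
        if score > self.best:
            self.best, self.bad_epochs = score, 0
            return
        self.bad_epochs += 1
        if self.bad_epochs >= self.patience:
            raise OverfitException(f"no improvement at epoch {self.epochs_done}")


def toy_train(model, epochs, callbacks):
    """Stand-in for model.train(): one fake epoch, then the epoch-end hooks."""
    for _ in range(epochs):
        model["epochs_trained"] += 1
        for cb in callbacks:
            cb.on_epoch_end(model)


# Fake held-out scores per epoch: they improve, then degrade.
scores = [0.50, 0.60, 0.65, 0.64, 0.63, 0.62]
model = {"epochs_trained": 0}
stopper = EarlyStopper(lambda m: scores[m["epochs_trained"] - 1], patience=1)

try:
    toy_train(model, epochs=len(scores), callbacks=[stopper])
except OverfitException as exc:
    print("training cut short:", exc)
# ... code using the partially trained model continues here
```

With a real model, the same try/except would wrap `model.train(..., callbacks=[stopper])`. It is probably safest to construct the model and build its vocabulary first, so that the model variable already exists when the exception propagates. Note that the model left behind is in the state of the epoch that triggered the stop, not the best-scoring epoch.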

But again, this is not the best way to deal with overfitting, or with other cases where more training seems to hurt evaluation scores.
