My Doc2Vec code, after many loops of training, isn't giving good results. What might be wrong?


Question

I'm training a Doc2Vec model using the below code, where tagged_data is a list of TaggedDocument instances I set up before:

from gensim.models.doc2vec import Doc2Vec

max_epochs = 40

model = Doc2Vec(alpha=0.025,
                min_alpha=0.001)

model.build_vocab(tagged_data)

for epoch in range(max_epochs):
    print('iteration {0}'.format(epoch))
    model.train(tagged_data,
                total_examples=model.corpus_count,
                epochs=model.iter)
    # decrease the learning rate
    model.alpha -= 0.001
    # fix the learning rate, no decay
    model.min_alpha = model.alpha

model.save("d2v.model")
print("Model Saved")

When I later check the model results, they're not good. What might have gone wrong?
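For readers unfamiliar with the setup: a tagged_data list of this kind is typically assembled along these lines (a minimal sketch with placeholder documents; the actual corpus and tagging scheme may differ):

from gensim.models.doc2vec import TaggedDocument
from gensim.utils import simple_preprocess

# Hypothetical placeholder documents – the real corpus differs.
raw_docs = [
    "the quick brown fox jumps over the lazy dog",
    "doc2vec learns a vector for every tagged document",
]

# One TaggedDocument per text, each with a unique string tag.
tagged_data = [
    TaggedDocument(words=simple_preprocess(doc), tags=[str(i)])
    for i, doc in enumerate(raw_docs)
]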

Answer

Do not call .train() multiple times in your own loop that tries to do alpha arithmetic.

It's unnecessary, and it's error-prone.

Specifically, in the above code, decrementing the original 0.025 alpha by 0.001 forty times results in a final alpha of 0.025 - 40*0.001 = -0.015, which means alpha was negative for many of the later training epochs. But a negative alpha learning-rate is nonsensical: it essentially asks the model to nudge its predictions a little bit in the wrong direction, rather than a little bit in the right direction, on every bulk training update. (Further, since model.iter is by default 5, the above code actually performs 40 * 5 = 200 training passes, which probably isn't the conscious intent. That will merely confuse readers of the code and slow training, though, rather than totally sabotage results the way the alpha mishandling does.)
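A quick way to see the problem is to replay the loop's alpha arithmetic on its own (a minimal sketch of the same decrement schedule, independent of gensim):

alpha = 0.025

for epoch in range(40):
    # The loop trains at the current alpha, then decrements it.
    if alpha < 0:
        print(f"epoch {epoch}: training pass uses alpha = {alpha:.3f} (negative!)")
    alpha -= 0.001

print(f"final alpha: {alpha:.3f}")  # -0.015 – roughly the last third of passes ran below zero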

There are other common variants of this error, as well. If alpha were instead decremented by 0.0001, the 40 decrements would only reduce the final alpha to 0.021 – whereas the proper practice for this style of SGD (stochastic gradient descent) with linear learning-rate decay is for the value to end very close to 0.000. And if users start tinkering with max_epochs – it is, after all, a parameter pulled right out on top! – but don't also adjust the decrement every time, they are likely to far-undershoot or far-overshoot 0.000.

So don't use this pattern.

Unfortunately, many bad online examples have copied this anti-pattern from each other, and make serious errors in their own epochs and alpha handling. Please don't copy their error, and please let their authors know they're misleading people wherever this problem appears.

The above code can be improved with the much-simpler replacement:

from gensim.models.doc2vec import Doc2Vec

max_epochs = 40
model = Doc2Vec()  # of course, if non-default parameters are needed, use them here –
                   # but most users won't need to change alpha/min_alpha at all

model.build_vocab(tagged_data)
model.train(tagged_data, total_examples=model.corpus_count, epochs=max_epochs)

model.save("d2v.model")

Here, the .train() method will do exactly the requested number of epochs, smoothly reducing the internal effective alpha from its default starting value to near-zero. (It's rare to need to change the starting alpha, but even if you wanted to, just setting a new non-default value at initial model creation is enough.)
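Once trained this way, the model can be sanity-checked, for example by re-inferring a vector for one of the training documents and confirming its own tag ranks highly among the most-similar results. A minimal sketch, assuming gensim 4.x where document vectors live on model.dv (older releases exposed them as model.docvecs), and reusing the tagged_data list from above:

from gensim.models.doc2vec import Doc2Vec

model = Doc2Vec.load("d2v.model")

# Infer a fresh vector for the first training document's words...
inferred = model.infer_vector(tagged_data[0].words)

# ...and list the trained document-tags most similar to it. A healthy
# model usually ranks the document's own tag at or near the top.
print(model.dv.most_similar([inferred], topn=3))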

