Gensim equivalent of training steps


Problem description


Does gensim Word2Vec have an option that is the equivalent of "training steps" in the TensorFlow word2vec example here: Word2Vec Basic? If not, what default value does gensim use? Is the gensim parameter iter related to training steps?


The TensorFlow script includes this section.

with tf.Session(graph=graph) as session:
    # We must initialize all variables before we use them.
    init.run()
    print('Initialized')

    average_loss = 0
    for step in xrange(num_steps):
        batch_inputs, batch_labels = generate_batch(
            batch_size, num_skips, skip_window)
        feed_dict = {train_inputs: batch_inputs, train_labels: batch_labels}

        # We perform one update step by evaluating the optimizer op (including
        # it in the list of returned values for session.run()).
        _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
        average_loss += loss_val

        if step % 2000 == 0:
            if step > 0:
                average_loss /= 2000
            # The average loss is an estimate of the loss over the last 2000 batches.
            print('Average loss at step ', step, ': ', average_loss)
            average_loss = 0

        # Note that this is expensive (~20% slowdown if computed every 500 steps)
        if step % 10000 == 0:
            sim = similarity.eval()
            for i in xrange(valid_size):
                valid_word = reverse_dictionary[valid_examples[i]]
                top_k = 8  # number of nearest neighbors
                nearest = (-sim[i, :]).argsort()[1:top_k + 1]
                log_str = 'Nearest to %s:' % valid_word
                for k in xrange(top_k):
                    close_word = reverse_dictionary[nearest[k]]
                    log_str = '%s %s,' % (log_str, close_word)
                print(log_str)
    final_embeddings = normalized_embeddings.eval()


In the TensorFlow example, if I perform t-SNE on the embeddings and plot them with matplotlib, the plot looks more reasonable to me when the number of steps is high. I am using a small corpus of 1,200 emails. One way it looks more reasonable is that numbers are clustered together. I would like to attain the same apparent level of quality using gensim.
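For reference, the projection step described above can be sketched as follows. This is a minimal example, with a random matrix standing in for the real trained embeddings, and it assumes scikit-learn is available:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for a trained embedding matrix: 200 "words" x 50 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 50))

# Project to 2-D; each row of coords can then be scattered with matplotlib.
coords = TSNE(n_components=2, perplexity=30, init="random",
              random_state=0).fit_transform(embeddings)
print(coords.shape)  # (200, 2)
```

With a real model, the embedding matrix would come from the trained vectors rather than a random generator.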

Answer

Yes, the Word2Vec class constructor has an iter parameter:


iter = number of iterations (epochs) over the corpus. Default is 5.


Also, if you call the Word2Vec.train() method directly, you can pass an epochs argument that has the same meaning.


The number of actual training steps is derived from the number of epochs, but it also depends on other parameters such as text size, window size, and batch size. If you simply want to improve the quality of the embedding vectors, increasing iter is the right approach.

