word2vec: negative sampling (in layman term)?


Problem description

I'm reading the paper below and I have some trouble understanding the concept of negative sampling.

http://arxiv.org/pdf/1402.3722v1.pdf

Can anyone help?

Accepted answer

The idea of word2vec is to maximise the similarity (dot product) between the vectors for words which appear close together (in the context of each other) in text, and minimise the similarity of words that do not. In equation (3) of the paper you link to, ignore the exponentiation for a moment. You have

       v_c . v_w
  -------------------
  sum_i(v_ci . v_w)

The numerator is basically the similarity between the word c (the context) and the word w (the target). The denominator computes the similarity of all other contexts ci and the target word w. Maximising this ratio ensures that words which appear closer together in text have more similar vectors than words that do not. However, computing this can be very slow, because there are many contexts ci. Negative sampling is one of the ways of addressing this problem: just select a couple of contexts ci at random. The end result is that if cat appears in the context of food, then the vector of food is more similar to the vector of cat (as measured by their dot product) than the vectors of several other randomly chosen words (e.g. democracy, greed, Freddy), instead of all the other words in the language. This makes word2vec much, much faster to train.
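To make that difference concrete, here is a minimal NumPy sketch of the two objectives. It is only an illustration under assumed names and sizes (W, C, vocab_size, and k are all made up here), not the author's code: real word2vec also draws negatives from a smoothed unigram distribution (word frequency raised to the power 0.75) rather than uniformly, and updates the vectors with SGD.

    import numpy as np

    rng = np.random.default_rng(0)

    vocab_size, dim = 10_000, 100
    # Two embedding tables, as in word2vec: one for targets, one for contexts.
    W = rng.normal(scale=0.1, size=(vocab_size, dim))  # target vectors v_w
    C = rng.normal(scale=0.1, size=(vocab_size, dim))  # context vectors v_c

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def full_softmax_prob(w, c):
        """p(c | w) with the full softmax: the denominator sums over
        every context in the vocabulary, so the cost grows with vocab_size."""
        scores = C @ W[w]                            # v_ci . v_w for every ci
        exp_scores = np.exp(scores - scores.max())   # stabilised exponentiation
        return exp_scores[c] / exp_scores.sum()

    def negative_sampling_loss(w, c, k=5):
        """Negative-sampling objective for one (target, context) pair:
        push sigmoid(v_c . v_w) towards 1 for the observed pair and
        towards 0 for k randomly drawn negative contexts."""
        pos = np.log(sigmoid(C[c] @ W[w]))
        negatives = rng.integers(0, vocab_size, size=k)  # uniform, for simplicity
        neg = np.log(sigmoid(-(C[negatives] @ W[w]))).sum()
        return -(pos + neg)  # reads k + 1 rows of C instead of all vocab_size

Note that full_softmax_prob reads every row of C on every call, while negative_sampling_loss reads only k + 1 rows; that difference is the entire speed-up.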
