Difference between max length of word ngrams and size of context window

Question

    In the description of the fasttext library for python https://github.com/facebookresearch/fastText/tree/master/python for training a supervised model there are different arguments, among which the following are stated:

    • ws: size of the context window
    • wordNgrams: max length of word ngram.

    If I understand it right, both of them are responsible for taking into account the surrounding words of the word, but what is the clear difference between them?

    Solution

    First, we use the train_unsupervised API to create a word-representation model. There are two techniques that we can use: skipgram and cbow. On the other hand, we use the train_supervised API to create a text-classification model. You are asking about the train_supervised API, so I will stick to it.

    The way text classification works in fastText is to first represent the words using skipgram by default. Then, these word vectors learned from the skipgram model are used to classify your input text. The two parameters that you asked about (ws and wordNgrams) are related to the skipgram/cbow model.
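    Concretely, both parameters are simply passed to the training call. Here is a minimal sketch, assuming a hypothetical training file train.txt with __label__-prefixed labels (the file name and label are placeholders for illustration, not part of the original answer):

```python
import fasttext

# Hypothetical training file "train.txt", one example per line, with the label
# prefixed by "__label__", e.g.:
# __label__animal The quick brown fox jumps over the lazy dog
model = fasttext.train_supervised(
    input="train.txt",
    ws=2,          # size of the context window
    wordNgrams=2,  # consider word unigrams and bigrams of the input text
)

# Predict the label of a new piece of text
print(model.predict("the quick brown fox"))
```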

    The following image contains a simplified illustration of how we use our input text to train the skipgram model. Here, we defined the ws parameter as 2 and wordNgrams as 1.

    As we can see, we have only one text in our training data, which is The quick brown fox jumps over the lazy dog. We defined the context window to be two, which means that we will create a window whose center is the center word and in which the next/previous two words are target words. Then, we move this window one word at a time. The bigger the window size, the more training samples you have for your model, and the more overfitted the model becomes given a small sample of data.
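    To make the sliding window concrete, here is a small plain-Python sketch (not fastText internals) that enumerates the (center word, target word) pairs a window of ws=2 produces for this sentence:

```python
# Enumerate (center, target) pairs for a context window of size ws=2.
sentence = "The quick brown fox jumps over the lazy dog".split()
ws = 2

for i, center in enumerate(sentence):
    # take up to ws words before and ws words after the center word
    for j in range(max(0, i - ws), min(len(sentence), i + ws + 1)):
        if j != i:
            print(center, "->", sentence[j])
# e.g. the window around "fox" yields: quick, brown, jumps, over
```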

    That's for our first argument ws. As for the second argument wordNgrams, if we set it to 2, it will consider two-word pairs like in the following image. (The ws in the following image is one for simplicity.)
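    Again as a plain-Python sketch (not the library's implementation), these are the extra word bigrams that wordNgrams=2 adds on top of the individual words:

```python
# Build the word bigrams that wordNgrams=2 would add as features.
tokens = "The quick brown fox jumps over the lazy dog".split()

bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
print(bigrams)
# ['The quick', 'quick brown', 'brown fox', 'fox jumps',
#  'jumps over', 'over the', 'the lazy', 'lazy dog']
```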

    Ref

    • Check this link which contains the source code for the train_supervised method.

    • There is a major difference between skipgram and cbow that can be summarized in the following image (see also the sketch below):
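    In short, skipgram predicts the surrounding words from the center word, while cbow predicts the center word from its (averaged) surrounding context. Here is a minimal sketch of how the two modes are selected with the train_unsupervised API, assuming a plain-text file corpus.txt:

```python
import fasttext

# "corpus.txt" is an assumed plain-text file with one or more sentences per line.
skipgram_model = fasttext.train_unsupervised("corpus.txt", model="skipgram", ws=2)
cbow_model = fasttext.train_unsupervised("corpus.txt", model="cbow", ws=2)

# Both models expose word vectors learned with a context window of 2
print(skipgram_model.get_word_vector("fox")[:5])
print(cbow_model.get_word_vector("fox")[:5])
```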
