What is the connection or difference between lemma and synset in wordnet?


Question

I am a complete beginner to NLP and NLTK.

I can't work out the exact difference between lemmas and synsets in wordnet, because both produce nearly the same output. For example, for the word cake I get this output:

lemmas :  [Lemma('cake.n.01.cake'), Lemma('patty.n.01.cake'), Lemma('cake.n.03.cake'), Lemma('coat.v.03.cake')]

synsets :  [Synset('cake.n.01'), Synset('patty.n.01'), Synset('cake.n.03'), Synset('coat.v.03')]

Please help me understand this concept.

Thank you.

Solution

The terms are based on the general sense of the words "lemma" and "synonym".

A lemma is WordNet's version of an entry in a dictionary: a word in canonical form, with a single meaning. E.g., if you wanted to look up "banks" in the dictionary, the canonical form would be "bank", and there would be separate lemmas for the nouns meaning "financial institution" and "side of the river", a separate one for the verb "to bank (on)", etc.
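To try the "banks" example yourself, here is a minimal sketch. It assumes NLTK is installed and the WordNet corpus has been downloaded (for example with nltk.download("wordnet")). The helper name senses_of is made up for illustration; wn.morphy, wn.lemmas, Lemma.synset and Synset.definition are the NLTK calls it relies on.

```python
def senses_of(word):
    """Return (lemma name, synset name, definition) for each sense of a word.

    Hypothetical helper for illustration; not part of NLTK.
    """
    # Imported here so that defining the function does not require NLTK.
    from nltk.corpus import wordnet as wn

    # Reduce an inflected form such as "banks" to its base form "bank".
    # morphy returns None when it finds no base form, so fall back to the input.
    base = wn.morphy(word) or word

    rows = []
    for lemma in wn.lemmas(base):
        synset = lemma.synset()  # the one sense this lemma belongs to
        rows.append((lemma.name(), synset.name(), synset.definition()))
    return rows
```

Calling senses_of("banks") should give one row per sense of "bank", nouns and verbs alike, each tied to a different synset.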

The term synset stands for "set of synonyms". A set of synonyms is a set of words with similar meaning, e.g. ship, skiff, canoe, kayak might all be synonyms for boat. In NLTK, a synset is in fact a set of lemmas that share a meaning. Taking your example (the results of wn.synsets("cake") and wn.lemmas("cake")), we can also write:

>>> from nltk.corpus import wordnet as wn
>>> synsets = wn.synsets("cake")
>>> synsets[0]
Synset('cake.n.01')
>>> synsets[0].lemmas()
[Lemma('cake.n.01.cake'), Lemma('cake.n.01.bar')]

These are the lemmas making up the first synset given for "cake".
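One way to see why the two outputs in the question look so alike is a toy model of the data structure. The sketch below is plain Python, not NLTK's implementation: the class and function names are made up, and the small table is hand-written from the output shown above (the extra members patty and coat are inferred from the synset names).

```python
from dataclasses import dataclass

# Toy data: synset name -> member words. Hand-written for illustration.
TOY_SYNSETS = {
    "cake.n.01": ["cake", "bar"],
    "patty.n.01": ["patty", "cake"],
    "cake.n.03": ["cake"],
    "coat.v.03": ["coat", "cake"],
}


@dataclass(frozen=True)
class ToyLemma:
    """One word in one specific sense: a (synset, word) pair."""
    synset: str
    word: str

    def __str__(self):
        return f"{self.synset}.{self.word}"


def toy_synsets(word):
    """All synsets that contain the word."""
    return [name for name, words in TOY_SYNSETS.items() if word in words]


def toy_lemmas(word):
    """One lemma per synset that contains the word."""
    return [ToyLemma(name, word) for name in toy_synsets(word)]


def toy_members(synset):
    """All lemmas that make up one synset."""
    return [ToyLemma(synset, word) for word in TOY_SYNSETS[synset]]
```

Here toy_lemmas("cake") and toy_synsets("cake") both have four entries, one per sense of "cake", which is why the two lists line up. The difference only shows when going the other way: toy_members("cake.n.01") contains both cake and bar.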

WordNet lets you explore relationships like hypernyms/hyponyms, usage domains, and more. For more information, look directly in the WordNet documentation; NLTK just provides an interface to it. The WordNet glossary is a good place to start.
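As a starting point for exploring those relationships, here is a small sketch, again assuming NLTK with the WordNet corpus installed. The helper name neighbours is hypothetical; Synset.hypernyms() and Synset.hyponyms() are NLTK methods that return lists of synsets.

```python
def neighbours(synset_name):
    """Return the names of the hypernyms and hyponyms of one synset.

    Hypothetical helper for illustration; not part of NLTK.
    """
    # Imported here so that defining the function does not require NLTK.
    from nltk.corpus import wordnet as wn

    synset = wn.synset(synset_name)  # look up by name, e.g. "cake.n.03"
    return {
        "hypernyms": [s.name() for s in synset.hypernyms()],  # more general
        "hyponyms": [s.name() for s in synset.hyponyms()],    # more specific
    }
```

For instance, neighbours("cake.n.03") lists the more general and more specific concepts around that sense of "cake".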
