Word sense disambiguation in NLTK Python

Question

I am new to NLTK in Python and am looking for a sample application that does word sense disambiguation. My searches turned up plenty of algorithms but no sample application. I just want to pass in a sentence and get the sense of each word by looking it up in the WordNet library. Thanks.

I found a similar module in Perl (http://marimba.d.umn.edu/allwords/allwords.html). Is there a comparable module in NLTK for Python?

Recommended answer

Recently, part of the pywsd code has been ported into the bleeding-edge version of NLTK, in the wsd.py module. Try:

>>> from nltk.wsd import lesk
>>> sent = 'I went to the bank to deposit my money'
>>> ambiguous = 'bank'
>>> lesk(sent, ambiguous)
Synset('bank.v.04')
>>> lesk(sent, ambiguous).definition()
u'act as the banker in a game or in gambling'

For better WSD performance, use the pywsd library instead of the NLTK module. In general, simple_lesk() from pywsd does better than lesk from NLTK. I'll try to update the NLTK module as much as possible when I'm free.

In response to Chris Spencer's comment, please note the limitations of Lesk algorithms. I'm simply giving an accurate implementation of the algorithms; it's not a silver bullet. See http://en.wikipedia.org/wiki/Lesk_algorithm
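To make that limitation concrete, here is a minimal sketch of the gloss-overlap idea behind Lesk. It is not the NLTK or pywsd implementation: the sense names and glosses in `TOY_SENSES`, the stopword list, and the helper names `tokenize` and `toy_lesk` are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical toy sense inventory: these glosses are illustrative only,
# not copied from WordNet.
TOY_SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a river or lake",
    }
}

# Small, assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "my", "i", "that", "in", "by"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def toy_lesk(sentence, word, senses=TOY_SENSES):
    """Return the sense whose gloss shares the most words with the sentence."""
    context = tokenize(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        overlap = len(context & tokenize(gloss))
        # Strict '>' means ties go to whichever sense is listed first.
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(toy_lesk("I went to the bank to deposit my money", "bank"))
print(toy_lesk("We sat on the bank of the river", "bank"))
```

In this toy version "deposit" does not match "deposits", because no stemming is done, and a sentence that shares no words with any gloss simply falls back to the first listed sense. Exact-overlap counting is brittle in exactly this way, which is why Lesk-style methods can pick odd senses.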

Also note that although:

lesk("My cat likes to eat mice.", "cat", "n")

doesn't give you the right answer, you can use the pywsd implementation of max_similarity():

>>> from pywsd.similarity import max_similarity
>>> max_similarity('my cat likes to eat mice', 'cat', 'wup', pos='n').definition()
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
>>> max_similarity('my cat likes to eat mice', 'cat', 'lin', pos='n').definition()
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
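The general idea behind a maximum-similarity approach can be sketched as follows: score each candidate sense by how similar it is to the best-matching sense of every context word, then keep the top scorer. This is only a rough illustration under stated assumptions, not pywsd's actual code. The taxonomy in `PARENT`, the sense lists in `SENSES`, and the function names are hypothetical, and the context is assumed to be already lemmatized.

```python
# Hypothetical toy taxonomy (child -> parent); not real WordNet data.
PARENT = {
    "animal": "entity", "mammal": "animal",
    "feline": "mammal", "rodent": "mammal",
    "cat.feline": "feline", "mouse.rodent": "rodent",
    "artifact": "entity", "vehicle": "artifact", "device": "artifact",
    "cat.tractor": "vehicle", "mouse.device": "device",
}

# Hypothetical sense inventory for the two ambiguous lemmas.
SENSES = {
    "cat": ["cat.feline", "cat.tractor"],
    "mouse": ["mouse.rodent", "mouse.device"],
}


def path_to_root(node):
    """Return [node, parent, ..., root]; its length is the node's depth."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b)), root depth 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # Walking up from a, the first node also above b is the deepest shared ancestor.
    lcs = next(n for n in path_a if n in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def toy_max_similarity(context_lemmas, word):
    """Pick the sense of `word` with the highest summed best-match similarity."""
    def score(sense):
        return sum(
            # Context lemmas with no known senses contribute 0.
            max((wup(sense, other) for other in SENSES.get(lemma, [])), default=0.0)
            for lemma in context_lemmas
            if lemma != word
        )
    return max(SENSES[word], key=score)


print(toy_max_similarity(["cat", "like", "eat", "mouse"], "cat"))
```

Unlike gloss overlap, this relies on taxonomy structure, so "cat" and "mouse" can support each other's animal senses even when their definitions share no words.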

@Chris, if you want a Python setup.py, just ask politely and I'll write it...
