Finding synonyms, definitions and example sentences using WordNet


Question

I need to read an input text file that contains a single word, then use WordNet to find the lemma names, definition and examples for each synset of that word. I have gone through the books "Python Text Processing with NLTK 2.0 Cookbook" and "Natural Language Processing using NLTK" to help me in this direction. Although I understand how to do this interactively in the terminal, I have not been able to do the same from a script written in a text editor.

For example, if the input text contains the word "flabbergasted", the output needs to look like this:

flabbergasted (verb) flabbergast, boggle, bowl over - overcome with amazement; "This boggles the mind!" (adjective) dumbfounded, dumfounded, flabbergasted, stupefied, thunderstruck, dumbstruck, dumbstricken - as if struck dumb with astonishment and surprise; "a circle of policement stood dumbfounded by her denial of having seen the accident"; "the flabbergasted aldermen were speechless"; "was thunderstruck by the news of his promotion"

The synsets, definitions and example sentences come directly from WordNet.

I have the following code:


import nltk
from nltk.corpus import wordnet as wn

# Read the whole input file; "with" closes it afterwards
with open("inpsyn.txt") as fp:
    data = fp.read()

# Split the text into sentences and print them
print('\n-----\n'.join(nltk.sent_tokenize(data)))

# Split the text into lowercase word tokens and print them
tokens = nltk.wordpunct_tokenize(data)
words = [w.lower() for w in tokens]
print(words)

for a in words:
    print(a)

    # Look up the synsets inside the loop so every word is processed
    syns = wn.synsets(a)
    print("synsets:", syns)

    # In NLTK 3, lemmas(), name(), definition() and examples() are methods
    for s in syns:
        for l in s.lemmas():
            print(l.name())
        print(s.definition())
        print(s.examples())

I get the following output:


flabbergasted

['flabbergasted']
flabbergasted
synsets: [Synset('flabbergast.v.01'), Synset('dumbfounded.s.01')]
flabbergast
boggle
bowl_over
overcome with amazement
['This boggles the mind!']
dumbfounded
dumfounded
flabbergasted
stupefied
thunderstruck
dumbstruck
dumbstricken
as if struck dumb with astonishment and surprise
['a circle of policement stood dumbfounded by her denial of having seen the accident', 'the flabbergasted aldermen were speechless', 'was thunderstruck by the news of his promotion']

Is there a way to retrieve the part of speech along with each group of lemma names?

Recommended answer

def synset(word):
    wn.synsets(word)

This doesn't return anything, so by default you get None.

You should write:

def synset(word):
    return wn.synsets(word)
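
To see why the missing return matters, here is a minimal sketch that runs without NLTK. The dictionary `table` is a hypothetical stand-in for `wn.synsets`, and both function names are made up for illustration:

```python
def lookup_without_return(word, table):
    # The result is computed and then discarded, so the caller gets None
    table.get(word, [])


def lookup_with_return(word, table):
    # The result is handed back to the caller
    return table.get(word, [])


# Hypothetical stand-in for wn.synsets, so the sketch runs without NLTK
table = {"car": ["car.n.01", "car.n.02"]}

missing = lookup_without_return("car", table)
found = lookup_with_return("car", table)
print(missing)
print(found)
```

The first call prints None, which is what happens when the wrapper around `wn.synsets` has no return statement.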

To extract the lemma names:

>>> from nltk.corpus import wordnet
>>> syns = wordnet.synsets('car')
>>> # In NLTK 3, lemmas() and name() are methods (they were attributes in NLTK 2)
>>> syns[0].lemmas()[0].name()
'car'
>>> [s.lemmas()[0].name() for s in syns]
['car', 'car', 'car', 'car', 'cable_car']
>>> [l.name() for s in syns for l in s.lemmas()]
['car', 'auto', 'automobile', 'machine', 'motorcar', 'car', 'railcar', 'railway_car', 'railroad_car', 'car', 'gondola', 'car', 'elevator_car', 'cable_car', 'car']
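
The question also asks for the part of speech next to each group of lemma names. Each synset carries a one-letter tag for this ('n', 'v', 'a', 's' or 'r'). The following is a minimal sketch, assuming NLTK 3, where `pos()`, `lemma_names()`, `definition()` and `examples()` are methods on a synset. The names `POS_NAMES`, `format_synsets` and `describe` are made up for illustration, and the output layout only approximates the one requested in the question:

```python
# Map WordNet's one-letter POS tags to readable names.
# 's' marks a "satellite" adjective, reported here simply as an adjective.
POS_NAMES = {"n": "noun", "v": "verb", "a": "adjective", "s": "adjective", "r": "adverb"}


def format_synsets(word, synsets):
    """Return the word followed by one line per synset:
    (pos) lemma, lemma - definition; "example"; "example"
    """
    lines = [word]
    for syn in synsets:
        # Fall back to the raw tag if it is not in the table
        pos = POS_NAMES.get(syn.pos(), syn.pos())
        # WordNet joins multi-word lemmas with underscores
        lemmas = ", ".join(name.replace("_", " ") for name in syn.lemma_names())
        entry = "({}) {} - {}".format(pos, lemmas, syn.definition())
        for example in syn.examples():
            entry += '; "{}"'.format(example)
        lines.append(entry)
    return "\n".join(lines)


def describe(word):
    """Look the word up in WordNet; needs NLTK and its 'wordnet' corpus."""
    # Imported lazily so format_synsets can be used without NLTK installed
    from nltk.corpus import wordnet as wn
    return format_synsets(word, wn.synsets(word))
```

Calling `print(describe("flabbergasted"))` should then print the word followed by one line for the verb synset and one for the adjective synset, provided the WordNet corpus has been downloaded.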
