How do I print out just the word itself in a WordNet synset using Python NLTK?

Question

Is there a way in Python 2.7, using NLTK, to get just the word and not the extra formatting, such as "Synset", the parentheses, and the "n.01" suffix?

For example, if I do this:

        wn.synsets('dog')

I get this result:

[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]

How can I instead get a list like this?

dog
frump
cad
frank
pawl
andiron
chase

Is there a way to do this using NLTK, or do I have to use regular expressions? Can I use regular expressions within a Python script?

Recommended answer

Try this:

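from nltk.corpus import wordnet as wn
# In NLTK 3.x, lemmas() and name() are methods; NLTK 2.x exposed them as attributes.
for synset in wn.synsets('dog'):
    print(synset.lemmas()[0].name())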

This iterates over each synset for dog and prints the headword of that synset, that is, the name of its first lemma. Keep in mind that several words can belong to the same synset, so to get every word associated with all the synsets for dog, you could do:

# Print every lemma name of every synset (NLTK 3.x method-style API).
for synset in wn.synsets('dog'):
    for lemma in synset.lemmas():
        print(lemma.name())
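The list in the question shows "dog" only once, even though two of the synsets (dog.n.01 and dog.n.03) share that headword. A minimal sketch of one way to collapse such repeats while keeping the original order is shown here. The helper name `headwords` is hypothetical and not part of NLTK, and the sketch assumes the NLTK 3.x API, where `lemmas()` and `name()` are methods.

```python
def headwords(synsets):
    """Return the first lemma name of each synset, in order, without repeats.

    `synsets` is assumed to be an iterable of objects that behave like
    NLTK 3.x Synset objects (a lemmas() method returning objects with name()).
    """
    seen = set()
    words = []
    for synset in synsets:
        word = synset.lemmas()[0].name()
        if word not in seen:  # skip headwords that were already collected
            seen.add(word)
            words.append(word)
    return words
```

With NLTK installed and the WordNet corpus downloaded, calling `headwords(wn.synsets('dog'))` after `from nltk.corpus import wordnet as wn` and printing each item is intended to give one word per line, as in the question. No regular expressions are needed for this.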
