Fully parsable dictionary/thesaurus
Problem description
I'm in the early stages of designing a series of simple word games which I hope will help me learn new words. A crucial part of my idea is a fully parsable dictionary: I want to be able to use regular expressions to search the dictionary for given words and extract other information about them (e.g. definition, type (noun/verb...), synonyms, antonyms, quotes demonstrating the word in use, etc.). I currently have Wordbook (a Mac app), which I find okay, but I haven't figured out whether I can parse it with a Python script. I'm assuming I can't, and was wondering if anyone knows of a reasonable dictionary that will allow this. Ideally I would do all this without depending on the internet.
Thanks
The NLTK WordNet corpus provides a programmatic interface to a "large lexical database of English words". You can navigate the word graph through a variety of relationships (synonyms, antonyms and so on). It meets the requirements for showing "definition, part-of-speech, synonyms, antonyms, quotes", and for coming "from a dictionary which is ideally downloadable".
Another option would be to download a recent snapshot of Wiktionary data and parse it into a format you can use, but this may be a bit involved (unless a decent Python Wiktionary parser already exists).
Here is an example of printing out some attributes using WordNet:
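import textwrap
from nltk.corpus import wordnet as wn  # fetch the corpus once with nltk.download('wordnet')

# Map WordNet part-of-speech tags to readable names.
POS = {
    'v': 'verb', 'a': 'adjective', 's': 'satellite adjective',
    'n': 'noun', 'r': 'adverb'}

def info(word, pos=None):
    """Print definition, synonyms, antonyms and examples for each sense of word."""
    for i, syn in enumerate(wn.synsets(word, pos)):
        # In NLTK 3 these synset/lemma attributes are methods, hence the calls.
        syns = [n.replace('_', ' ') for n in syn.lemma_names()]
        ants = [a for m in syn.lemmas() for a in m.antonyms()]
        ind = ' ' * 12  # aligns wrapped lines under the 12-character labels
        defn = textwrap.wrap(syn.definition(), 64)
        print('sense %d (%s)' % (i + 1, POS[syn.pos()]))
        print('definition: ' + ('\n' + ind).join(defn))
        print('  synonyms:', ', '.join(syns))
        if ants:
            print('  antonyms:', ', '.join(a.name() for a in ants))
        if syn.examples():
            print('  examples: ' + ('\n' + ind).join(syn.examples()))
        print()

info('near')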
Output:
sense 1 (verb)
definition: move towards
  synonyms: approach, near, come on, go up, draw near, draw close, come near
  examples: We were approaching our destination
            They are drawing near
            The enemy army came nearer and nearer

sense 2 (adjective)
definition: not far distant in time or space or degree or circumstances
  synonyms: near, close, nigh
  antonyms: far
  examples: near neighbors
            in the near future
            they are near equals

...
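Since the question asks about searching with regular expressions, here is a minimal sketch of how that could look once you have a list of headwords. The helper name search_headwords and the sample list are made up for illustration; with NLTK installed, wn.all_lemma_names() should be able to supply the real list of WordNet headwords instead.

```python
import re

def search_headwords(pattern, headwords):
    """Return the headwords matching a regular expression, sorted and de-duplicated."""
    rx = re.compile(pattern, re.IGNORECASE)
    # WordNet joins multi-word lemmas with underscores, so show them with spaces.
    readable = {w.replace('_', ' ') for w in headwords}
    return sorted(w for w in readable if rx.search(w))

# Hypothetical sample data standing in for a real headword list.
sample = ['near', 'nearby', 'draw_near', 'far', 'nigh', 'near']

print(search_headwords(r'^near', sample))    # headwords starting with "near"
print(search_headwords(r'\bnear$', sample))  # headwords ending in the word "near"
```

Each match can then be passed to a lookup such as info() above to pull out its definition, synonyms and examples.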