Using Arabic WordNet for synonyms in Python?

Question

I am trying to get synonyms for the Arabic words in a sentence.

If the word is in English it works perfectly, and the results are displayed in Arabic. I was wondering whether it is possible to get the synonyms of an Arabic word directly, without writing it in English first.

I tried the following, but it didn't work. I would also prefer to look words up without tashkeel (diacritics): انتظار instead of اِنْتِظار

from nltk.corpus import wordnet as omw
jan = omw.synsets('انتظار ')[0]
print(jan)
print(jan.lemma_names(lang='arb'))

Recommended answer

The WordNet used in NLTK doesn't support Arabic. If you are looking for Arabic WordNet (AWN), that is a totally different thing.

For Arabic WordNet, download:

  • http://nlp.lsi.upc.edu/awn/get_bd.php
  • http://nlp.lsi.upc.edu/awn/AWNDatabaseManagement.py.gz

You run it with:

$ python AWNDatabaseManagement.py -i upc_db.xml


Now, to get something like wn.synset('إنتظار'): Arabic WordNet has a function wn.get_synsets_from_word(word), but it returns synset offsets rather than synset objects. It also accepts words only in the vocalized form stored in the database. For example, you should use جَمِيل for جميل:

>>> wn.get_synsets_from_word(u"جَمِيل")
[(u'a', u'300218842')]

300218842 is the offset of the synset of جميل.
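Since the lookup expects the vocalized form but the question asks for input without tashkeel, one possible workaround is to build a reverse index from unvocalized forms to the vocalized forms in the database. The following is a minimal sketch under stated assumptions, not part of AWN: `strip_tashkeel` and `build_unvocalized_index` are hypothetical helper names, and the list of vocalized words is a hand-written sample, since the way to enumerate every word in the AWN database is not shown here.

```python
import re
from collections import defaultdict

# Arabic harakat (U+064B..U+0652), superscript alef (U+0670) and tatweel (U+0640).
_TASHKEEL = re.compile("[\u064B-\u0652\u0670\u0640]")


def strip_tashkeel(word):
    """Return `word` with Arabic diacritics and tatweel removed."""
    return _TASHKEEL.sub("", word)


def build_unvocalized_index(vocalized_words):
    """Map each unvocalized form to the set of vocalized forms that reduce to it."""
    index = defaultdict(set)
    for vocalized in vocalized_words:
        index[strip_tashkeel(vocalized)].add(vocalized)
    return index


# Hypothetical sample; in practice these would be the vocalized words stored in AWN.
sample_words = ["\u062c\u064e\u0645\u0650\u064a\u0644"]  # the vocalized form of "jamil"
index = build_unvocalized_index(sample_words)

# Look up the plain spelling to recover the vocalized candidates.
candidates = index.get("\u062c\u0645\u064a\u0644", set())
print(candidates)
```

Each vocalized candidate found this way could then be passed to wn.get_synsets_from_word. Several vocalized words can share one unvocalized spelling, which is why the index stores a set rather than a single value.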

I checked for the word إنتظار, and it seems that it doesn't exist in AWN.
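Note that the spelling checked here, إنتظار, begins with alef with hamza below, while the question wrote انتظار with a bare alef. If the lookup is as strict about alef forms as it is about vocalization (an assumption, not something confirmed for AWN), it may be worth trying each spelling. A minimal sketch, with `initial_alef_variants` as a hypothetical helper name:

```python
# Alef forms: bare alef, hamza above, hamza below, madda.
ALEF_FORMS = ["\u0627", "\u0623", "\u0625", "\u0622"]


def initial_alef_variants(word):
    """Return spellings of `word` that differ only in the form of a leading alef.

    Words that do not start with an alef are returned unchanged in a one-item list.
    """
    if not word or word[0] not in ALEF_FORMS:
        return [word]
    return [form + word[1:] for form in ALEF_FORMS]


# The spelling with hamza below, as checked in the answer.
for variant in initial_alef_variants("\u0625\u0646\u062a\u0638\u0627\u0631"):
    print(variant)
```

The variants are still unvocalized, so each one would also need to be matched against the vocalized forms in the database before being looked up.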

More details about using AWN to get synonyms here.
