Get gender from noun using NLTK with German corpora


Question

I'm experimenting with NLTK. My question is whether the library can detect the gender of a noun in German. I want this information in order to determine whether a text is written in a gender-neutral way. See here for more information: https://en.wikipedia.org/wiki/Gender_neutrality_in_languages_with_grammatical_gender

The code below tags my sentence, but I can't see any information about the gender of "Mitarbeiter". My code so far:

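import nltk

sentence = """Der Mitarbeiter geht."""
tokens = nltk.word_tokenize(sentence)
# pos_tag defaults to an English tagger (Penn Treebank tags), which carries no gender information
tagged = nltk.pos_tag(tokens)
print(tagged[0:6])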

I haven't found any tools or scripts that accomplish this so far. Maybe there is also a better approach to my task.

Recommended answer

I don't believe NLTK can do that out of the box for German. However, there are freely available morphological taggers for German that can do it for you, for example RFTagger:

http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/

It gives output like this:

Das     PRO.Dem.Subst.-3.Nom.Sg.Neut 
ist     VFIN.Sein.3.Sg.Pres.Ind 
ein     ART.Indef.Nom.Sg.Masc 
Testsatz    N.Reg.Nom.Sg.Masc 
.   SYM.Pun.Sent 
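
In output of this shape, the gender appears as one of the dot-separated fields of each tag. A minimal sketch of pulling it out for nouns in Python, assuming one whitespace-separated token/tag pair per line as in the sample, and assuming noun tags start with `N`. The function name `noun_genders` is made up for illustration, and the `Fem` label is an assumption, since the sample only shows `Masc` and `Neut`:

```python
# Sketch: extract noun genders from RFTagger-style output.
# Assumes one "token<whitespace>tag" pair per line, as in the sample.
GENDERS = {"Masc", "Fem", "Neut"}  # "Fem" is assumed; it is not in the sample


def noun_genders(tagger_output):
    """Return (token, gender) pairs for every noun line in the output."""
    result = []
    for line in tagger_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        token, tag = parts
        fields = tag.split(".")
        if fields[0] != "N":
            continue  # keep nouns only
        # gender is None when no known gender label is present in the tag
        gender = next((f for f in fields if f in GENDERS), None)
        result.append((token, gender))
    return result


sample = """Das     PRO.Dem.Subst.-3.Nom.Sg.Neut
ist     VFIN.Sein.3.Sg.Pres.Ind
ein     ART.Indef.Nom.Sg.Masc
Testsatz    N.Reg.Nom.Sg.Masc
.   SYM.Pun.Sent"""

print(noun_genders(sample))  # [('Testsatz', 'Masc')]
```

For a gender-neutrality check, the article and pronoun tags also carry gender, so the same field lookup could be applied to them by relaxing the noun filter.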

However, it is not written in Python, so you would have to call it using subprocess. Another option would be to obtain a corpus in which nouns are tagged for German gender, such as the Tiger corpus:

http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger.en.html

and train NLTK to recognize the genders, but I would expect RFTagger to be a quicker and more accurate solution.
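
For the RFTagger route, calling an external tagger through subprocess could look like the sketch below. The command line is an assumption: `tagger_cmd` is a placeholder for however the tagger is invoked on your system (check its README for the actual binary and parameter file). The sketch only assumes that the tool accepts an input file path as its last argument, reads one token per line, and writes its result to standard output. The function name `run_external_tagger` is hypothetical.

```python
import os
import subprocess
import tempfile


def run_external_tagger(tagger_cmd, tokens):
    """Write one token per line to a temp file, run the tagger on it,
    and return the tagger's standard output as text.

    tagger_cmd is a list such as ["path/to/tagger", "path/to/model"];
    these values are placeholders, not RFTagger's documented interface.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write("\n".join(tokens) + "\n")
        path = handle.name
    try:
        completed = subprocess.run(
            list(tagger_cmd) + [path],
            capture_output=True,
            text=True,
            encoding="utf-8",
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        return completed.stdout
    finally:
        os.remove(path)  # clean up the temporary input file
```

The tokens could come from `nltk.word_tokenize`, and the returned text can then be split line by line to read off the gender field of each tag.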
