POS tagging in German

Question

I am using NLTK to extract nouns from a text string, starting with the following command:

tagged_text = nltk.pos_tag(nltk.Text(nltk.word_tokenize(some_string)))

It works fine in English. Is there an easy way to make it work for German as well?

(I have no experience with natural language programming, but I managed to use the Python NLTK library, which has been great so far.)

Recommended answer

Natural language software does its magic by leveraging corpora and the statistics they provide. You'll need to tell NLTK about a German corpus to help it tokenize and tag German correctly. I believe the EUROPARL corpus might help get you going.

See nltk.corpus.europarl_raw and this answer for example configuration.
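
To make the "corpus plus statistics" idea concrete, here is a minimal sketch of training a German tagger with NLTK. It is not the configuration from the linked answer. The three training sentences, their STTS-style tags, the capitalization fallback rule, and the helper names build_tagger and extract_nouns are all made up for illustration. As far as I know, europarl_raw is plain untagged text, so a real setup would more likely train on a hand-annotated German corpus such as TIGER, with many thousands of sentences.

```python
from nltk.tag import RegexpTagger, UnigramTagger
from nltk.tokenize import RegexpTokenizer

# Tiny hand-tagged training set (STTS-style tags), invented for this example.
# A usable tagger needs a large annotated corpus instead.
TRAIN_SENTS = [
    [("Der", "ART"), ("Hund", "NN"), ("schläft", "VVFIN"), (".", "$.")],
    [("Die", "ART"), ("Katze", "NN"), ("sieht", "VVFIN"),
     ("den", "ART"), ("Hund", "NN"), (".", "$.")],
    [("Das", "ART"), ("Haus", "NN"), ("ist", "VAFIN"),
     ("groß", "ADJD"), (".", "$.")],
]


def build_tagger(train_sents):
    """Unigram tagger with a crude fallback for unseen words."""
    # Assumption: German nouns are capitalized, so guess "NN" for unseen
    # capitalized words and the placeholder tag "XY" for everything else.
    fallback = RegexpTagger([
        (r"^[A-ZÄÖÜ].*$", "NN"),
        (r".*", "XY"),
    ])
    # The unigram tagger assigns each known word its most frequent tag
    # in the training data and defers to the fallback otherwise.
    return UnigramTagger(train_sents, backoff=fallback)


def extract_nouns(text, tagger):
    """Return the tokens tagged as common (NN) or proper (NE) nouns."""
    # Regex tokenizer: runs of word characters, or single punctuation marks.
    # It needs no downloaded model, unlike nltk.word_tokenize.
    tokens = RegexpTokenizer(r"\w+|[^\w\s]").tokenize(text)
    return [word for word, tag in tagger.tag(tokens) if tag in ("NN", "NE")]


tagger = build_tagger(TRAIN_SENTS)
nouns = extract_nouns("Die Katze sieht das Auto.", tagger)
print(nouns)
```

The capitalization fallback is only a stopgap for words the training data never saw. With a real annotated corpus, the same UnigramTagger call (or a stronger tagger backed off to it) would take over most of that work.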

Also, consider tagging this question with "nlp".
