NLTK identifies verb as noun in imperatives


Question

I am using the NLTK POS tagger as shown below:

>>> from nltk import pos_tag, word_tokenize
>>> sent1 = 'get me now'
>>> sent2 = 'run fast'
>>> tags = pos_tag(word_tokenize(sent2))
>>> print(tags)
[('run', 'NN'), ('fast', 'VBD')]

I found a similar post, "NLTK Thinks that Imperatives are Nouns", which suggests adding the word to a dictionary as a verb. The problem is that I have too many such unknown words. I do have one clue, though: they always appear at the start of a phrase.

For example: 'Download now', 'Book it now', 'Sign up'

How can I help NLTK produce the correct tags?

Recommended answer

There are other third-party models that you can load in NLTK. Take a look at "Python NLTK pos_tag not returning the correct part-of-speech tag".

To answer the question with a hack: you can trick the POS tagger by prepending a pronoun so that the verb gets a subject, then dropping the pronoun from the result, e.g.

>>> from nltk import pos_tag
>>> sent1 = 'get me now'.split()
>>> sent2 = 'run fast'.split()
>>> pos_tag(['He'] + sent1)
[('He', 'PRP'), ('get', 'VBD'), ('me', 'PRP'), ('now', 'RB')]
>>> pos_tag(['He'] + sent1)[1:]
[('get', 'VBD'), ('me', 'PRP'), ('now', 'RB')]

The same trick wrapped in a function:

>>> from nltk import pos_tag
>>> sent1 = 'get me now'.split()
>>> sent2 = 'run fast'.split()
>>> def imperative_pos_tag(sent):
...     return pos_tag(['He']+sent)[1:]
... 
>>> imperative_pos_tag(sent1)
[('get', 'VBD'), ('me', 'PRP'), ('now', 'RB')]
>>> imperative_pos_tag(sent2)
[('run', 'VBP'), ('fast', 'RB')]
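
The wrapper above can also be written so that the tagger and the dummy subject are parameters. This is only a sketch: `tag_imperative`, its `tagger` argument, and the `subject` default are hypothetical names chosen for illustration, not part of NLTK or of the original answer. It assumes `tagger` is any callable that maps a list of tokens to a list of `(word, tag)` pairs, which is the shape `nltk.pos_tag` returns.

```python
from typing import Callable, List, Sequence, Tuple

Tagged = List[Tuple[str, str]]


def tag_imperative(
    tokens: Sequence[str],
    tagger: Callable[[List[str]], Tagged],
    subject: str = "He",  # assumption: any subject pronoun could be used here
) -> Tagged:
    """Tag an imperative by temporarily prepending a dummy subject."""
    if not tokens:
        # Nothing to tag; avoid sending only the dummy subject to the tagger.
        return []
    tagged = tagger([subject] + list(tokens))
    # Drop the dummy subject so the result lines up with the input tokens.
    return tagged[1:]


# Possible usage with NLTK (requires the tagger model to be downloaded):
#   from nltk import pos_tag
#   tag_imperative('run fast'.split(), pos_tag)
```

Taking the tagger as an argument keeps the hack independent of the model, so one of the third-party taggers mentioned earlier could be passed in instead of `pos_tag`.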


If you want all verbs in your imperative to receive the base-form VB tag:

>>> from nltk import pos_tag
>>> sent1 = 'get me now'.split()
>>> sent2 = 'run fast'.split()
>>> def imperative_pos_tag(sent):
...     return [(word, tag[:2]) if tag.startswith('VB') else (word,tag) for word, tag in pos_tag(['He']+sent)[1:]]
... 
>>> imperative_pos_tag(sent1)
[('get', 'VB'), ('me', 'PRP'), ('now', 'RB')]
>>> imperative_pos_tag(sent2)
[('run', 'VB'), ('fast', 'RB')]
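
A different heuristic, not from the answer above, uses the clue given in the question: the mis-tagged verbs always appear at the start of the phrase. The sketch below post-processes an already tagged phrase and retags the first token as VB when the tagger called it a common noun. The name `force_initial_verb` and the `NOUN_TAGS` set are assumptions made for illustration. It only touches the first token and will be wrong for phrases that really do start with a noun.

```python
from typing import List, Tuple

# Assumption: only common-noun tags are overridden; extend the set if needed.
NOUN_TAGS = {"NN", "NNS"}


def force_initial_verb(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Retag the first token as VB if it was tagged as a common noun."""
    if tagged and tagged[0][1] in NOUN_TAGS:
        word, _ = tagged[0]
        return [(word, "VB")] + list(tagged[1:])
    # Leave everything else exactly as the tagger produced it.
    return list(tagged)


# Applied to the tagger output from the question:
print(force_initial_verb([("run", "NN"), ("fast", "VBD")]))
# [('run', 'VB'), ('fast', 'VBD')]
```

Note that the second token keeps whatever tag it had, so this fixes the verb but not the 'fast' tag from the question; the pronoun trick handles both.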
