pos_tag in NLTK does not tag sentences correctly

Views: 255 | Published: 2020/5/18 1:22:29 | Tag: nltk

Problem description

I have used this code:

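# Step 1: tokenize
from nltk.tokenize import word_tokenize
text = "John is very nice. Is John very nice?"  # the two sentences being tagged
words = word_tokenize(text)

# Step 2: POS disambiguation
from nltk.tag import pos_tag
tags = pos_tag(words)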

to tag two sentences: John is very nice. Is John very nice?

John was tagged NN in the first sentence but VB in the second! So how can we correct the pos_tag function without training back-off taggers?

Modified question:

I have seen the demonstration of NLTK taggers here: http://text-processing.com/demo/tag/. When I tried the option "English Taggers & Chunkers: Treebank" or "Brown Tagger", I got the correct tags. So how can I use, for example, the Brown Tagger without training it?

Solution

Short answer: you can't. Slightly longer answer: you can override specific words using a manually created UnigramTagger. See my answer for custom tagging with nltk for details on this method.
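
To illustrate the override idea, here is a minimal sketch, not the code from the linked answer. It builds a UnigramTagger from a hand-written word-to-tag dictionary and lets its tags take precedence over whatever a base tagger returns. The names OVERRIDES and tag_with_overrides are hypothetical, NNP is only an assumed choice of tag for "John", and DefaultTagger stands in for the base tagger so the sketch runs without downloading any NLTK models.

```python
from nltk.tag import DefaultTagger, UnigramTagger

# Hand-written overrides: word -> tag that should always win.
# (Hypothetical table; NNP is an assumed choice for "John".)
OVERRIDES = {"John": "NNP"}

# With no backoff, a model-based UnigramTagger returns None
# for every word that is not in the model.
override_tagger = UnigramTagger(model=OVERRIDES)


def tag_with_overrides(tokens, base_tag):
    """Tag tokens with base_tag, then apply the manual overrides.

    base_tag is any callable that maps a list of tokens to a list of
    (token, tag) pairs, for example nltk.pos_tag.
    """
    base = base_tag(tokens)
    forced = override_tagger.tag(tokens)
    return [
        (word, forced_tag if forced_tag is not None else base_tag_)
        for (word, base_tag_), (_, forced_tag) in zip(base, forced)
    ]


# Stand-in base tagger (assumption): tags everything NN, needs no model data.
stand_in = DefaultTagger("NN").tag
print(tag_with_overrides(["Is", "John", "very", "nice", "?"], stand_in))
```

In practice you would pass nltk.pos_tag as base_tag instead of the stand-in, which requires the tagger model to be downloaded first. Only the words listed in OVERRIDES are forced; every other token keeps the base tagger's output.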
