Using custom POS tags for NLTK chunking?

Question

Is it possible to use non-standard part-of-speech tags when writing a chunking grammar in NLTK? For example, I have the following sentence to parse:

complication/patf associated/qlco with/prep breast/noun surgery/diap
independent/adj of/prep the/det use/inpr of/prep surgical/diap device/medd ./pd

Specialized tags such as "medd" or "diap" make it much easier to locate the phrases I need in the text. I assumed that, because the chunker parses with regular expressions, it would not depend on any particular tag set, but when I run the following code I get an error:

grammar = r'TEST: {<diap>}'
cp = nltk.RegexpParser(grammar)
cp.parse(sentence)

ValueError: Transformation generated invalid chunkstring:
<patf><qlco><prep><noun>{<diap>}<adj><prep><det><inpr><prep>{<diap>}<medd><pd>

I think this has to do with the tags themselves, because NLTK can't generate a tree from them. Is it possible to skip that step and just get the chunked items returned? Maybe NLTK isn't the best tool for this, and if so, can anyone recommend another module for chunking text?

I'm developing in Python 2.7.6 with the Anaconda distribution.

Thanks in advance!

Recommended answer

Yes, it is possible to use custom tags for NLTK chunking. I have done the same. See: How to parse custom tags using nltk.Regexp.parser()

The ValueError and its description suggest that the problem lies in how your grammar is formed, so that is what you need to check. You can update the question with those details to get suggestions on how to correct it.
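
As a minimal sketch of the point that custom tags work, assuming a recent NLTK release on Python 3 (not the Python 2.7.6 setup from the question), the example sentence can be turned into the list of (word, tag) tuples that RegexpParser expects and chunked with the same grammar. The names `raw` and `chunks` are illustrative only, and this sketch does not diagnose the original ValueError, whose cause cannot be determined from the question alone.

```python
import nltk

# The tagged sentence from the question, as one "word/tag" string.
raw = (
    "complication/patf associated/qlco with/prep breast/noun surgery/diap "
    "independent/adj of/prep the/det use/inpr of/prep surgical/diap "
    "device/medd ./pd"
)

# Split each token on its last "/" to get (word, tag) tuples.
# Splitting by hand keeps the custom tags in lowercase exactly as written.
sentence = [tuple(token.rsplit("/", 1)) for token in raw.split()]

# Same grammar as in the question: every <diap> token becomes a TEST chunk.
grammar = r"TEST: {<diap>}"
cp = nltk.RegexpParser(grammar)
tree = cp.parse(sentence)

# Collect only the chunked items, ignoring the rest of the tree.
chunks = [
    subtree.leaves()
    for subtree in tree.subtrees()
    if subtree.label() == "TEST"
]

for chunk in chunks:
    print(chunk)
```

Walking `tree.subtrees()` and filtering on the chunk label is one way to get just the chunked items back without working with the full tree.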
