How to parse custom tags using nltk.RegexpParser()


Question

My question is similar to the unanswered question "Using custom POS tags for NLTK chunking?", but the error I am getting is different. I am trying to parse a sentence to which I have added my own domain-specific tags.

For example:

(u'greatest', 'P'), (u'internet', 'NN'), (u'ever', 'A'),
(u',', ','), (u'and', 'CC'), (u'its', 'PRP$'), (u'being', 'VBG'),
(u'slow', 'N'), (u'as', 'IN'), (u'hell', 'NN')

Here (u'slow', 'N') carries the custom tag 'N'.

I am trying to parse this using the following:

grammar=r"""
Chunk:`{<A>?*<P>+}`
"""
parser=nltk.RegexpParser(grammar)

But I get the following error:

ValueError: Illegal chunk pattern: `{<A>?*<P>+}`

Does nltk.RegexpParser process custom tags? Is there any other NLTK- or Python-based parser that can do that?

Recommended answer

nltk.RegexpParser can process custom tags.
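The error in the question appears to come from the pattern itself rather than from the custom tags: `<A>?*` stacks two quantifiers. NLTK compiles tag patterns into Python regular expressions, and Python's `re` module rejects a quantifier applied directly to another quantifier. The sketch below uses plain `re` with a simplified stand-in for the translated pattern; the parenthesised strings and the `compiles` helper are assumptions for illustration, not NLTK's exact internal output.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Simplified stand-ins for the two tag patterns (hypothetical translation).
original = r"(<A>)?*(<P>)+"  # `?` immediately followed by `*`
fixed = r"(<A>)*(<P>)+"      # a single quantifier per tag

print(compiles(original))  # False
print(compiles(fixed))     # True
```

Since `<A>*` already means "zero or more A tags", the extra `?` added nothing, so dropping it keeps the intended meaning.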

Here is how you can modify your code to work:

# Import the RegexpParser
from nltk.chunk import RegexpParser

# Define your custom tagged data. 
tags = [(u'greatest', 'P'), (u'internet', 'NN'), (u'ever', 'A'), 
(u',', ','), (u'and', 'CC'), (u'its', 'PRP$'), (u'being', 'VBG'), 
(u'slow', 'N'), (u'as', 'IN'), (u'hell', 'NN')]

# Define your custom grammar (modified to be a valid regex).
grammar = """ CHUNK: {<A>*<P>+} """

# Create an instance of your custom parser.
custom_tag_parser = RegexpParser(grammar)

# Parse!
custom_tag_parser.parse(tags)

This is the result you would get for your test data:

Tree('S', [Tree('CHUNK', [(u'greatest', 'P')]), (u'internet', 'NN'), (u'ever', 'A'), (u',', ','), (u'and', 'CC'), (u'its', 'PRP$'), (u'being', 'VBG'), (u'slow', 'N'), (u'as', 'IN'), (u'hell', 'NN')])
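To work with the chunks rather than the whole tree, one option is to walk the tree and collect the subtrees labelled CHUNK. This is a minimal sketch, assuming NLTK 3 (where `Tree.label()` and `Tree.subtrees(filter=...)` are available); the variable names are illustrative.

```python
from nltk.chunk import RegexpParser

# Same tagged tokens as above, including the custom tags 'P', 'A' and 'N'.
tagged = [("greatest", "P"), ("internet", "NN"), ("ever", "A"),
          (",", ","), ("and", "CC"), ("its", "PRP$"), ("being", "VBG"),
          ("slow", "N"), ("as", "IN"), ("hell", "NN")]

parser = RegexpParser("CHUNK: {<A>*<P>+}")
tree = parser.parse(tagged)

# Keep only the subtrees produced by the CHUNK rule and list their tokens.
chunks = [
    subtree.leaves()
    for subtree in tree.subtrees(filter=lambda t: t.label() == "CHUNK")
]
print(chunks)  # [[('greatest', 'P')]]
```

Note that ('ever', 'A') stays outside any chunk: the pattern requires at least one P tag after the optional A tags, and 'ever' is followed by a comma.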
