NLTK can't interpret grammar category PRP$ output by Stanford parser

Problem description

I want to generate sentences from a grammar retrieved from the Stanford parser, but NLTK is not able to interpret PRP$.

from nltk.parse.stanford import StanfordParser
from nltk.grammar import CFG
from nltk.parse.generate import generate

sp=StanfordParser(model_path='/home/aman/stanford_resource/stanford-parser-full-2014-06-16/stanford-parser-3.4-models/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz',path_to_jar='/home/aman/stanford_resource/stanford-parser-full-2014-06-16/stanford-parser.jar',path_to_models_jar='/home/aman/stanford_resource/stanford-postagger-full-2014-08-27/stanford-postagger-3.4.1.jar')
sent1='He killed the tiger in his pants'
parse_result=sp.raw_parse(sent1)
grammar_list=[]

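for p in parse_result:
    # Turn the tree's productions into text, then re-parse that text as a CFG.
    l = p.productions()
    grammar_string = '\n'.join(map(str, l))
    grammar = CFG.fromstring(grammar_string)  # raises the ValueError below
    #grammar_list.append(grammar)
    #for s in generate(grammar, n=3):
    #    print(s)
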
ValueError: Unable to parse line 11: NP -> PRP$ NNS
Expected a nonterminal, found: $ NNS

How can I make this work? Should I explicitly tell NLTK about these grammar categories?

Recommended answer

ValueError: Unable to parse line 11: NP -> PRP$ NNS
Expected a nonterminal, found: $ NNS

I've no idea why you are trying to combine a hand-built CFG with the output of the Stanford parser, but here's a solution to this problem:

A quick inspection of nltk/grammar.py shows that $ is not a legal character in a nonterminal name. This can easily be corrected by patching the module like this:

import re
import nltk

# Same pattern as in nltk/grammar.py, with "$" added to the allowed characters.
nltk.grammar._STANDARD_NONTERM_RE = re.compile(r'( [\w/][\w$/^<>-]* ) \s*', re.VERBOSE)
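
To see why the extra character matters, the two patterns can be compared using only the standard re module. This is a minimal sketch: ORIGINAL is assumed to be the patched pattern with the $ taken back out, and first_nonterminal is a hypothetical helper written for this illustration, not an NLTK function.

```python
import re

# Assumed pre-patch pattern: the patched one with "$" removed from the class.
ORIGINAL = re.compile(r'( [\w/][\w/^<>-]* ) \s*', re.VERBOSE)
# Patched pattern from the answer, which also accepts "$" inside a name.
PATCHED = re.compile(r'( [\w/][\w$/^<>-]* ) \s*', re.VERBOSE)


def first_nonterminal(pattern, text):
    """Return (matched nonterminal name, remaining text) for the start of text."""
    m = pattern.match(text)
    return m.group(1), text[m.end():]


rhs = "PRP$ NNS"
original_result = first_nonterminal(ORIGINAL, rhs)
patched_result = first_nonterminal(PATCHED, rhs)

print(original_result)  # ('PRP', '$ NNS')
print(patched_result)   # ('PRP$', 'NNS')
```

With the unpatched pattern the name stops at PRP and the leftover text starts with "$ NNS", which is the fragment quoted in the error message.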

The patch simply adds $ to the regexp that you'll find in nltk/grammar.py. You can then create and use grammars that have $ in their productions:

grammar = nltk.grammar.CFG.fromstring("NP -> PRP$ NNS")
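
If patching a private module variable feels fragile, one possible alternative (not part of the original answer) is to skip the string round trip and pass the tree's Production objects straight to the CFG constructor. The sketch below is only an illustration: the bracketed tree is a hand-written stand-in for the Stanford parser output so that it runs without the parser jars, and ROOT is assumed to be the start symbol.

```python
from nltk.grammar import CFG, Nonterminal
from nltk.parse.generate import generate
from nltk.tree import Tree

# Hand-written stand-in for the parser output (an assumption for this sketch).
tree = Tree.fromstring(
    "(ROOT (S (NP (PRP He)) (VP (VBD killed) (NP (DT the) (NN tiger))"
    " (PP (IN in) (NP (PRP$ his) (NNS pants))))))"
)

# Build the grammar directly from Production objects. Nothing is re-parsed
# from text, so the "$" in PRP$ never reaches the nonterminal regexp.
productions = tree.productions()
grammar = CFG(Nonterminal("ROOT"), productions)

# Generate a few sentences from the grammar.
sentences = [" ".join(tokens) for tokens in generate(grammar, n=3)]
for s in sentences:
    print(s)
```

Because the grammar is derived from a single parse tree, the generated sentences only recombine the words of that one sentence.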
