高效的上下文无关语法解析器，最好是 Python 友好的 [英] Efficient Context-Free Grammar parser, preferably Python-friendly

查看：36 发布时间：2022/1/2 17:41:07 python parsing nlp grammar nltk

本文介绍了高效的上下文无关语法解析器，最好是 Python 友好的的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我需要为我的一个项目解析一小部分英语，描述为具有(1 级)特征结构的上下文无关语法(example)，我需要有效地做到这一点.

I am in need of parsing a small subset of English for one of my project, described as a context-free grammar with (1-level) feature structures (example) and I need to do it efficiently .

现在我正在使用 NLTK 的解析器，它产生正确的输出但速度很慢.对于我的约 450 个相当模糊的非词典规则和 50 万个词汇条目的语法，解析简单的句子可能需要 2 到 30 秒，这取决于结果树的数量.词法条目对性能几乎没有影响.

Right now I'm using NLTK's parser which produces the right output but is very slow. For my grammar of ~450 fairly ambiguous non-lexicon rules and half a million lexical entries, parsing simple sentences can take anywhere from 2 to 30 seconds, depending it seems on the number of resulting trees. Lexical entries have little to no effect on performance.

另一个问题是在开头加载 (25MB) 语法+词典可能需要一分钟.

Another problem is that loading the (25MB) grammar+lexicon at the beginning can take up to a minute.

从我在文献中可以找到的信息来看，用于解析这种语法(Earley 或 CKY)的算法的运行时间应该与语法的大小成线性关系，与输入标记列表的大小成三次方.我使用 NLTK 的经验表明，歧义是对性能影响最大的，而不是语法的绝对大小.

From what I can find in literature, the running time of the algorithm used to parse such a grammar (Earley or CKY) should be linear to the size of the grammar and cubic to the size of the input token list. My experience with NLTK indicates that ambiguity is what hurts the performance most, not the absolute size of the grammar.

所以现在我正在寻找一个 CFG 解析器来替换 NLTK.我一直在考虑 PLY 但我不知道它是否支持 CFG 中的特征结构，这在我的情况下是必需的，我看到的例子似乎在做很多程序解析而不是仅仅指定一个语法.任何人都可以向我展示一个支持功能结构和使用声明性语法的 PLY 示例吗?

So now I'm looking for a CFG parser to replace NLTK. I've been considering PLY but I can't tell whether it supports feature structures in CFGs, which are required in my case, and the examples I've seen seem to be doing a lot of procedural parsing rather than just specifying a grammar. Can anybody show me an example of PLY both supporting feature structs and using a declarative grammar?

我也可以使用任何其他可以高效完成我需要的解析器.Python 接口更可取，但并非绝对必要.

I'm also fine with any other parser that can do what I need efficiently. A Python interface is preferable but not absolutely necessary.

高效的上下文无关语法解析器，最好是 Python 友好的 [英] Efficient Context-Free Grammar parser, preferably Python-friendly

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

高效的上下文无关语法解析器，最好是 Python 友好的 [英] Efficient Context-Free Grammar parser, preferably Python-friendly

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭