Open-source rule-based pattern matching / information extraction frameworks?

Question

I'm shopping for an open-source framework for writing natural language grammar rules that pattern-match over annotations. You could think of it as regular expressions that match at the token level rather than the character level. Such a framework should let the match criteria reference other attributes attached to the input tokens or spans, and let an action modify those attributes.
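To make the requirement concrete, here is a minimal Python sketch of the idea described above: a rule is a sequence of predicates over token attributes, and an action writes a new attribute onto the matched span. It is only an illustration. The names `find_matches` and `annotate`, the `pos` and `label` attributes, and the sample tokens are all hypothetical and do not correspond to the API of any framework mentioned in this question.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical token representation: a dict of attributes per token.
Token = Dict[str, str]
Predicate = Callable[[Token], bool]


def find_matches(tokens: List[Token], pattern: List[Predicate]) -> List[Tuple[int, int]]:
    """Return non-overlapping (start, end) spans where each predicate
    matches one consecutive token, in order."""
    if not pattern:
        return []
    spans = []
    i = 0
    while i + len(pattern) <= len(tokens):
        window = tokens[i:i + len(pattern)]
        if all(pred(tok) for pred, tok in zip(pattern, window)):
            spans.append((i, i + len(pattern)))
            i += len(pattern)  # continue after the matched span
        else:
            i += 1
    return spans


def annotate(tokens: List[Token], spans: List[Tuple[int, int]], key: str, value: str) -> None:
    """Action: set an attribute on every token inside each matched span."""
    for start, end in spans:
        for tok in tokens[start:end]:
            tok[key] = value


# Sample input with made-up part-of-speech attributes.
tokens = [
    {"text": "Call", "pos": "VB"},
    {"text": "Dr.", "pos": "NNP"},
    {"text": "Smith", "pos": "NNP"},
    {"text": "tomorrow", "pos": "NN"},
]

# Rule: a title token followed by a proper noun.
pattern = [
    lambda t: t["text"] in {"Dr.", "Mr.", "Ms."},
    lambda t: t["pos"] == "NNP",
]

spans = find_matches(tokens, pattern)
annotate(tokens, spans, "label", "PERSON")
```

A real framework would add quantifiers, alternation, and nested spans on top of this; the sketch only covers fixed-length sequences.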

I know of the following options that fit this description:

  • GATE Java Expressions over Annotations (JAPE)
  • Stanford CoreNLP's TokensRegex
  • UIMA Ruta (Tutorial)
  • Graph Expression (GExp)*

Are there any other options currently available?

Related tools

  • While I know that general parser generators like Antlr can also serve this purpose, I'm looking for something more specifically tailored to natural language processing or information extraction.
  • UIMA includes a Regex Annotator plugin for declaring rules in XML, but it appears to operate at the character level rather than on higher-level objects.
  • I know that this kind of task is often performed with statistical models, but for narrow, structured domains there is a benefit in hand-crafting rules.

* With GExp, 'rules' are actually implemented in code, but since there are so few options I chose to include it.

Recommended answer

Unitex, French academic software from University Paris East, also matches your description (http://www-igm.univ-mlv.fr/~unitex/).

It's C++ based and includes many optional preprocessing rules and lexicons for 20+ languages.

The GUI is graph-based: you design automata, i.e. 'grammars'.
