Parse raw text with MaltParser in Java

Question

I found that NLTK in Python does this via the raw_parse function, but I need to use Java. I also found that ClearTK has a MaltParser wrapper, but there is no documentation for it. I'm looking for a function or a project that first converts raw English text into a CoNLL file that MaltParser can use, and then parses it with MaltParser. Any help is appreciated.

Recommended answer

The MaltParser 1.7.2 distribution ships with examples in the folder examples/apiexamples/srcex.

However, these examples only show how to run MaltParser programmatically after tokenization and POS tagging have already been performed (and after the output of those steps has been converted to a CoNLL-like format).
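To make the "CoNLL-like format" step concrete, here is a minimal sketch of turning already tokenized and POS-tagged text into tab-separated rows. It is not taken from the MaltParser distribution: the class name `ConllFormatter` and the method `toConllLines` are hypothetical, and it assumes the model wants the first six CoNLL-X columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS) with `_` for unknown values and the same tag reused for the coarse and fine POS columns. The tokens and tags in `main` are hand-written stand-ins for the output of a real tokenizer and tagger.

```java
import java.util.ArrayList;
import java.util.List;

/** Formats a pre-tokenized, POS-tagged sentence as CoNLL-X style input rows. */
final class ConllFormatter {

    private ConllFormatter() {}

    /**
     * Builds one tab-separated row per token with six columns:
     * ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS.
     * Assumption: LEMMA and FEATS are unknown ("_") and the same tag is
     * used for both the coarse and the fine POS column.
     */
    static String[] toConllLines(String[] tokens, String[] posTags) {
        if (tokens.length != posTags.length) {
            throw new IllegalArgumentException(
                    "tokens and posTags must have the same length");
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            lines.add(String.join("\t",
                    String.valueOf(i + 1), // ID, 1-based within the sentence
                    tokens[i],             // FORM
                    "_",                   // LEMMA (unknown)
                    posTags[i],            // CPOSTAG
                    posTags[i],            // POSTAG
                    "_"));                 // FEATS (unknown)
        }
        return lines.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Hand-written sample input; a real pipeline would get these
        // from a tokenizer and a POS tagger.
        String[] tokens = {"Dogs", "bark", "."};
        String[] posTags = {"NNS", "VBP", "."};
        for (String line : toConllLines(tokens, posTags)) {
            System.out.println(line);
        }
    }
}
```

Rows of this shape should resemble what the bundled API examples feed to the parser, but check those examples for the exact call and for the tag set your model was trained on.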

Since I can't currently offer a better (simpler/shorter) alternative, I can at least share a link to a Groovy script that performs tokenization, part-of-speech tagging (using OpenNLP), and dependency parsing (using MaltParser). The tools are made interoperable using UIMA. If you are familiar with Maven, it should be quite straightforward to derive a Java version of that script.

Mind you, this is not the best answer, but at this point it is possibly better than nothing.

Note: I'm a developer on both Apache UIMA and DKPro Core (the project to which the link points).
