OpenNLP named entity recognition model for time and date

Question

I am using OpenNLP models for named entity recognition.

I pass in sentences in which I want to identify entities. OpenNLP requires a String[] argument, so I split my string into words separated by spaces.

I am having trouble recognizing dates. For example, if the string contains the date 7 Jan 2012 and I split the string into words, "7", "Jan" and "2012" become three separate tokens. Although they are recognized as dates, three separate tokens are of no use to me for further processing. How can I split my string so that "2 Jan 2012" is taken as one string? 7 Jan 2012 is one format; sometimes it is also Jan 7,2012. The date model also picks up the time format I enter, such as 12:18pm.

The NER time model does not recognize the times 12:18pm or 09:52:52. What time formats does it accept?

Recommended answer

The Apache OpenNLP date and time models are statistical and trained from a corpus. They recognize dates and times from context, not only from format.
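On the question of getting a multi-token date back as one string: the name finder reports each entity as a span of token indices (start inclusive, end exclusive), so the tokens inside a span can be joined again. The sketch below illustrates that with plain Java only. The `TokenSpan` class and the hard-coded span are hypothetical stand-ins for what `NameFinderME.find(tokens)` would return, and whether the model actually tags all three tokens as one span depends on the model and the context. With the OpenNLP library itself, `Span.spansToStrings(spans, tokens)` should do the same joining.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class SpanJoiner {

    /** Hypothetical stand-in for an entity span: start inclusive, end exclusive. */
    static final class TokenSpan {
        final int start;
        final int end;
        final String type;

        TokenSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Joins the tokens covered by each span into one space-separated string. */
    static List<String> joinSpans(String[] tokens, List<TokenSpan> spans) {
        List<String> joined = new ArrayList<>();
        for (TokenSpan span : spans) {
            String[] covered = Arrays.copyOfRange(tokens, span.start, span.end);
            joined.add(String.join(" ", covered));
        }
        return joined;
    }

    public static void main(String[] args) {
        String[] tokens = "The meeting is on 7 Jan 2012 at noon".split(" ");
        // Assumed detector output: one "date" span covering tokens 4, 5 and 6.
        List<TokenSpan> spans = new ArrayList<>();
        spans.add(new TokenSpan(4, 7, "date"));
        System.out.println(joinSpans(tokens, spans)); // prints [7 Jan 2012]
    }
}
```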

If you have specific needs, you can create your own corpus and train your own OpenNLP Name Finder model.
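As a rough illustration of what such a corpus looks like: the Name Finder training format is plain text with one whitespace-tokenized sentence per line and each entity wrapped in `<START:type>` and `<END>` markers. The helper below is a hypothetical convenience for producing such lines from tokens and a known entity position. It is not part of OpenNLP, and the example sentence is made up.

```java
public final class TrainingLineBuilder {

    /**
     * Wraps tokens[start..end) in <START:type> ... <END> markers and returns
     * the sentence as one training line. The end index is exclusive.
     */
    static String annotate(String[] tokens, int start, int end, String type) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                line.append(' ');
            }
            if (i == start) {
                line.append("<START:").append(type).append("> ");
            }
            line.append(tokens[i]);
            if (i == end - 1) {
                line.append(" <END>");
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"Logged", "in", "at", "12:18pm", "today"};
        // prints: Logged in at <START:time> 12:18pm <END> today
        System.out.println(annotate(tokens, 3, 4, "time"));
    }
}
```

A useful corpus needs many such sentences, with the date and time formats you care about appearing in varied contexts.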

OpenNLP Name Finder also supports some customization at training time. If you create a corpus and also add some regex-based features, you may be able to improve your results.
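To make the regex idea concrete, here is a minimal sketch of patterns for the formats mentioned in the question (7 Jan 2012, Jan 7,2012, 12:18pm, 09:52:52). It uses only `java.util.regex` and runs on the raw text as a rule-based check. It does not show how to plug the patterns into OpenNLP feature generation, and the patterns are assumptions that cover only these example formats.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DateTimeRegexSketch {

    // English month names or three-letter abbreviations (assumed input language).
    private static final String MONTH =
        "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*";

    // Matches "7 Jan 2012" or "Jan 7,2012" / "Jan 7, 2012".
    static final Pattern DATE = Pattern.compile(
        "\\b(?:\\d{1,2}\\s+" + MONTH + "\\s+\\d{4}"
        + "|" + MONTH + "\\s+\\d{1,2},\\s*\\d{4})\\b");

    // Matches "12:18", "09:52:52", "12:18pm" or "12:18 pm".
    static final Pattern TIME = Pattern.compile(
        "\\b\\d{1,2}:\\d{2}(?::\\d{2})?(?:\\s?[AaPp][Mm])?\\b");

    /** Returns every substring of text matched by the pattern, in order. */
    static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Met on 7 Jan 2012 at 12:18pm, then on Jan 7,2012 at 09:52:52";
        System.out.println(findAll(DATE, text)); // prints [7 Jan 2012, Jan 7,2012]
        System.out.println(findAll(TIME, text)); // prints [12:18pm, 09:52:52]
    }
}
```

Matching on the raw string rather than on space-split tokens also sidesteps the splitting problem for these fixed formats, at the cost of losing the contextual judgment the statistical model provides.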
