Generic Date Parsing Library from Unstructured Text


Problem description


Can somebody suggest a Java library capable of parsing date/time calendar events from unstructured data? Examples:

  • Starts 10pm Tonight! Sunday feb 10th => 10/Feb/2013 10pm
  • tomorrow (feb 10th) => 10/Feb/2013
  • Sunday Feb 10\r\nwith daily screenings till Feb 16th

and so on

The input data comes from the user, who may enter it in any format. I started off by identifying all the possible tokens and using regex matches to parse them. I wonder if someone can suggest a Java library that would actually help with the parsing.

I looked through other posts on SO, but they seem to suggest techniques; I am looking for a library.
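For reference, the token-and-regex approach mentioned above can be sketched roughly as follows. This is only a minimal illustration, not a library: the class name `DateTokenSketch`, the method `extractDates`, and the `defaultYear` parameter are hypothetical, and the pattern only recognizes a month name followed by a day number. Relative words such as "tonight" or "tomorrow" and times such as "10pm" are not handled, and words that merely start with a month abbreviation (for example "maybe 5") would produce false matches.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: finds "<month name> <day>" tokens in free text.
final class DateTokenSketch {
    // Month abbreviations in calendar order; index + 1 is the month number.
    private static final List<String> MONTHS = List.of(
        "jan", "feb", "mar", "apr", "may", "jun",
        "jul", "aug", "sep", "oct", "nov", "dec");

    // A month abbreviation (optionally spelled out), whitespace, a 1-2 digit
    // day, and an optional ordinal suffix, e.g. "feb 10th" or "February 16".
    private static final Pattern MONTH_DAY = Pattern.compile(
        "\\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\\.?\\s+"
            + "(\\d{1,2})(?:st|nd|rd|th)?\\b",
        Pattern.CASE_INSENSITIVE);

    // The input carries no year, so the caller supplies one (an assumption).
    static List<LocalDate> extractDates(String text, int defaultYear) {
        List<LocalDate> dates = new ArrayList<>();
        Matcher m = MONTH_DAY.matcher(text);
        while (m.find()) {
            int month = MONTHS.indexOf(m.group(1).toLowerCase(Locale.ROOT)) + 1;
            int day = Integer.parseInt(m.group(2));
            try {
                dates.add(LocalDate.of(defaultYear, month, day));
            } catch (DateTimeException e) {
                // Skip impossible dates such as "feb 31".
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        System.out.println(extractDates("Starts 10pm Tonight! Sunday feb 10th", 2013));
        System.out.println(extractDates(
            "Sunday Feb 10\r\nwith daily screenings till Feb 16th", 2013));
    }
}
```

Each new input shape needs another pattern, which is why this approach gets unwieldy quickly for free-form user input.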

Thanks

Solution

UTAH (https://github.com/sonalake/utah-parser) can handle generic parsing of unstructured text into maps. Once you have the values in a map, you should be able to pass them to a date formatter.
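The second step, going from a map to a formatted date, might look something like the sketch below. It does not use or show UTAH's API. It assumes the parsing step has already produced a `Map<String, String>` with the hypothetical keys `day`, `month`, and `year` holding numeric strings, and it uses only `java.time` from the standard library. The class name `MapToDateSketch` is also made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: turns already-extracted fields into a formatted date.
final class MapToDateSketch {
    // Output shape taken from the question's examples, e.g. "10/Feb/2013".
    private static final DateTimeFormatter OUT =
        DateTimeFormatter.ofPattern("dd/MMM/yyyy", Locale.ENGLISH);

    // The keys "day", "month" and "year" are assumptions, not names
    // defined by any particular parsing library.
    static String format(Map<String, String> fields) {
        int day = Integer.parseInt(fields.get("day"));
        int month = Integer.parseInt(fields.get("month"));
        int year = Integer.parseInt(fields.get("year"));
        return LocalDate.of(year, month, day).format(OUT);
    }

    public static void main(String[] args) {
        Map<String, String> fields = Map.of("day", "10", "month", "2", "year", "2013");
        System.out.println(format(fields));
    }
}
```

Building a `LocalDate` first, rather than concatenating strings, means invalid combinations such as day 31 in February are rejected with an exception instead of being printed.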
