Temporal Extraction (i.e. Extract date/time entities from free-form text) - How?
Question
Has anyone found a simple but effective way to extract date references from text? I've done a fair amount of searching for temporal extraction tools, but there isn't a lot out there. There are a few white papers, but the topic seems to be a subset of the whole semantic web thing that hasn't been given much attention.
I'm just looking for something that is 80% effective. There is no need to capture things like "the month after Jan 2009", but basic, common date entities would be nice.
I'm open to all suggestions, even fancy regular expressions.
Fire away!
(and thanks - Henry)
If the target temporal expressions in your data come in only a limited set of formats, use regular expressions and refine your system iteratively.
Otherwise, use SUTime from the Stanford NLP toolkit, which might be overkill but will definitely meet your needs.
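To make the regular-expression route concrete, here is a minimal Python sketch. It assumes English text and a handful of common formats only (ISO dates, slash-separated dates, and month-name dates). The names `MONTH`, `DATE_PATTERN`, and `extract_dates` are made up for illustration, and the pattern list is a starting point to extend as you find misses in your own data, not a complete solution.

```python
import re

# English month names and common abbreviations, with an optional trailing dot.
# This list is an assumption for illustration; extend it for your data.
MONTH = (
    r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|"
    r"Nov(?:ember)?|Dec(?:ember)?)\.?"
)

# Alternatives are tried in order, so the more specific month-name forms
# come before the bare "month year" form.
DATE_PATTERN = re.compile(
    r"\b(?:"
    r"\d{4}-\d{2}-\d{2}"                                  # 2009-01-15
    r"|\d{1,2}/\d{1,2}/\d{2,4}"                           # 1/15/2009
    rf"|{MONTH}\s+\d{{1,2}}(?:st|nd|rd|th)?,?\s+\d{{4}}"  # Jan 15, 2009
    rf"|\d{{1,2}}(?:st|nd|rd|th)?\s+{MONTH}\s+\d{{4}}"    # 15 January 2009
    rf"|{MONTH}\s+\d{{4}}"                                # Jan 2009
    r")\b",
    re.IGNORECASE,
)


def extract_dates(text):
    """Return every substring of `text` that matches one of the date patterns."""
    return [match.group(0) for match in DATE_PATTERN.finditer(text)]
```

For example, `extract_dates("Moved from Jan 15, 2009 to 2009-02-03.")` returns `['Jan 15, 2009', '2009-02-03']`. The sketch only finds the matching strings; it does not validate them (so `13/45/2009` would match) or normalize them into date objects, and both steps are natural next iterations.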