Datefinder module: strange behavior on a particular string
Question
I have two strings:
s1 = 'Agreement dated on March 9, 2007'
s2 = 'Agreement signed on March 9, 2007'
I run the code below on each string:
import datefinder
matches = datefinder.find_dates(s1)  # or s2
for match in matches:
    print(match)
s2 gives me the desired result, but s1 does not, because it contains the word "dated".
P.S. I used datefinder because I have multiple date formats and would otherwise need to write multiple regexes. It has worked well except for this one case.
Any idea why this strange behavior occurs?
Recommended answer
This is not a bug in datefinder.
"dated" is one of the token patterns in datefinder's regex code:
EXTRA_TOKENS_PATTERN = r"due|by|on|during|standard|daylight|savings|time|date|dated|of|to|through|between|until|at|day"
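To see how this pattern treats the two strings from the question, here is a minimal sketch using only Python's re module. Wrapping the alternation in word boundaries is an assumption made for illustration; datefinder itself embeds the pattern in a larger regex, so this only shows which words the token list recognizes, not how datefinder parses the result.

```python
import re

# Pattern as quoted in the answer.
EXTRA_TOKENS_PATTERN = r"due|by|on|during|standard|daylight|savings|time|date|dated|of|to|through|between|until|at|day"

# Assumption for illustration only: match each token as a whole word.
token_re = re.compile(r"\b(?:" + EXTRA_TOKENS_PATTERN + r")\b")

s1 = 'Agreement dated on March 9, 2007'
s2 = 'Agreement signed on March 9, 2007'

tokens_s1 = token_re.findall(s1)
tokens_s2 = token_re.findall(s2)
print(tokens_s1)  # ['dated', 'on']
print(tokens_s2)  # ['on']
```

Both strings contain "on", but only s1 contains "dated", which is consistent with the explanation that "dated" is picked up as part of the date expression.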
You will likely need to write a regex to bypass the issue, or become a contributor to the project and fix it there:
https://github.com/akoumjian/datefinder
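One possible regex workaround is to remove the word "dated" before handing the text to datefinder. The sketch below is only an assumption about what such a workaround might look like: strip_dated is a hypothetical helper, not part of datefinder, and whether the cleaned string then parses as expected has not been verified here and may depend on the datefinder version.

```python
import re

# Hypothetical helper, not part of datefinder: matches the standalone
# word "dated" in any letter case.
_DATED = re.compile(r"\bdated\b", flags=re.IGNORECASE)


def strip_dated(text):
    """Remove the standalone word 'dated' and tidy the whitespace."""
    without = _DATED.sub(" ", text)
    # Collapse the extra whitespace left behind by the removal.
    return " ".join(without.split())


s1 = 'Agreement dated on March 9, 2007'
cleaned = strip_dated(s1)
print(cleaned)  # Agreement on March 9, 2007

# With datefinder installed, the cleaned text would then be passed along:
#   import datefinder
#   for match in datefinder.find_dates(cleaned):
#       print(match)
```

The word boundaries keep words such as "Updated" intact, so only the standalone token is removed.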