Solr Date Regex Query


Problem description

I want to use Solr's regular expression capabilities to query a date field. I tried a simple query like the following, but I get 0 results and no errors:

...?q=DATE:/200[0-9]-03-30T11\:58\:40Z/&fl=DATE

Here is some sample output:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="fl">id,date</str>
<str name="q">date:/.*:.*/</str>
</lst>
</lst>
<result name="response" numFound="39" start="0">
<doc>
<str name="id">1362932537549-A17C9685</str>
<date name="date">2012-10-31T14:57:53Z</date>
</doc>
<doc>
<str name="id">1362932537549-AD280D59</str>
<date name="date">2012-10-25T09:57:53Z</date>
</doc>
<doc>
<str name="id">1362932537549-B091BE97</str>
<date name="date">2012-10-23T09:57:53Z</date>
</doc>
<doc>
<str name="id">1362932537549-B0D8341C</str>
<date name="date">2012-10-22T14:57:53Z</date>
</doc>
<doc>
<str name="id">1362932537549-40083ADB</str>
<date name="date">2010-08-12T14:33:00Z</date>
</doc>
<doc>
<str name="id">1362932537549-9CA68015</str>
<date name="date">2011-07-20T12:25:02Z</date>
</doc>
...

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">8380</int>
<lst name="params">
<str name="fl">id,date</str>
<str name="q">date:/.*.*/</str>
</lst>
</lst>
<result name="response" numFound="1263" start="0">
<doc>
<str name="id">1362932537549-5A0DAFB7</str>
<date name="date">2010-08-12T14:31:00Z</date>
</doc>
<doc>
<str name="id">1362932537549-D712F1C71</str>
<date name="date">2011-12-01T13:23:53Z</date>
</doc>
<doc>
<str name="id">1362932537549-3FAA6BC</str>
<date name="date">2012-05-25T14:26:08Z</date>
</doc>
<doc>
<str name="id">1362932537549-C8A6B81F</str>
<date name="date">2010-08-12T14:25:00Z</date>
</doc>
<doc>
<str name="id">1362932537549-D712F1C8</str>
<date name="date">2011-12-01T13:23:53Z</date>
</doc>
...

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">34443</int>
<lst name="params">
<str name="fl">id,date</str>
<str name="q">date:/.*0.*/</str>
</lst>
</lst>
<result name="response" numFound="65" start="0">
<doc>
<str name="id">1362932537549-A4BC013G</str>
<date name="date">2012-10-29T17:57:53Z</date>
</doc>
<doc>
<str name="id">1362932537549-862F708G</str>
<date name="date">2013-02-14T09:48:46Z</date>
</doc>
<doc>
<str name="id">1362932537549-B8A38A74</str>
<date name="date">2013-02-14T09:49:18Z</date>
</doc>
<doc>
<str name="id">1362932537549-D4BA90CD</str>
<date name="date">2007-10-09T21:53:34Z</date>
</doc>
<doc>
<str name="id">1362932537549-3028513F</str>
<date name="date">2011-06-24T20:30:22Z</date>
</doc>


Recommended answer

Your regex looks okay, but instead of escaping the colons, try URL-encoding the value:

?q=DATE%3A%2F200%5B0-9%5D-03-30T11%5C%3A58%5C%3A40Z%2F&fl=DATE
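The encoded value above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the host, port, and core name in `base_url` are placeholders chosen for illustration, not details from the question.

```python
from urllib.parse import quote

# Raw Solr query exactly as written in the question (colons backslash-escaped).
raw_query = r"DATE:/200[0-9]-03-30T11\:58\:40Z/"

# safe="" percent-encodes every reserved character, including "/" and ":".
encoded_query = quote(raw_query, safe="")

# Hypothetical endpoint: replace host, port, and core name with your own.
base_url = "http://localhost:8983/solr/mycore/select"
url = f"{base_url}?q={encoded_query}&fl=DATE"

print(encoded_query)
print(url)
```

Passing `safe=""` matters here: by default `quote` leaves `/` unencoded, which would not match the `%2F` delimiters shown above.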






(Migrated from a comment on the question.)

It doesn't seem possible to run a regex directly against a date field.

As you found, the queries date:/.*_.*/, date:/.*,.*/, and date:/.*A.*/ all return results, even though the timestamps clearly contain none of those characters. I think what's happening is that date is not a string field, so when you query for a character like :, you're actually matching documents whose encoded (e.g. raw binary) data happens to contain that character. (In layman's terms, imagine opening binary data, such as an executable file, in Notepad and searching for an ASCII character.)
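The Notepad analogy can be made concrete with a small Python sketch. It packs each timestamp as a big-endian 64-bit count of milliseconds since the epoch and checks whether a given ASCII character shows up among the raw bytes. That packing is an assumption made purely for illustration: Solr's real index encoding for date fields is different, and the helper names here are hypothetical.

```python
import struct
from datetime import datetime, timezone


def epoch_millis(iso_utc: str) -> int:
    """Convert a Solr-style UTC timestamp to milliseconds since the epoch."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    return int(dt.replace(tzinfo=timezone.utc).timestamp()) * 1000


def raw_bytes(iso_utc: str) -> bytes:
    # Assumption for illustration only: a big-endian signed 64-bit integer.
    # This is NOT the encoding Solr actually uses in its index.
    return struct.pack(">q", epoch_millis(iso_utc))


def bytes_contain(iso_utc: str, char: str) -> bool:
    """True if the ASCII byte for `char` occurs in the packed value."""
    return char.encode("ascii") in raw_bytes(iso_utc)


# Timestamps taken from the sample output in the question.
samples = ["2012-10-31T14:57:53Z", "2010-08-12T14:33:00Z", "2007-10-09T21:53:34Z"]
for ts in samples:
    hits = [c for c in ":_,A0" if bytes_contain(ts, c)]
    print(ts, raw_bytes(ts).hex(), hits)
```

Whether a character "matches" under this scheme depends only on the numeric value of the timestamp, not on how the date is displayed, which is the point the analogy is making.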

This also explains why you're getting about the same number of results, 20 to 30, for all of those queries: statistically speaking, a regex for a random ASCII character run against binary (or otherwise encoded) data should match at about the same frequency.
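If the goal is what the question's pattern implies (the same instant on 30 March in each year from 2000 to 2009), one possible workaround that avoids regex entirely is to enumerate the candidate timestamps and OR them together as quoted exact values. This is a sketch of that idea rather than something proposed in the answer above; the field name `DATE` comes from the question, `build_date_query` is a hypothetical helper, and the assumption is that the field accepts quoted ISO-8601 values in standard query syntax.

```python
from urllib.parse import quote


def build_date_query(field: str, years, suffix: str) -> str:
    """Build field:("Y1<suffix>" OR "Y2<suffix>" ...) for exact date matches."""
    # Quoting each value avoids having to backslash-escape the colons.
    values = [f'"{year}{suffix}"' for year in years]
    return f"{field}:(" + " OR ".join(values) + ")"


# Years 2000 through 2009, mirroring the 200[0-9] part of the regex.
query = build_date_query("DATE", range(2000, 2010), "-03-30T11:58:40Z")

# Percent-encode the whole value before placing it after "q=" in a URL.
encoded = quote(query, safe="")

print(query)
print(encoded)
```

For patterns that cover a contiguous span of time rather than scattered instants, a date range query is likely a better fit than enumerating values.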
