Elasticsearch Regex Query

Question

I am running Elasticsearch v1.1.1 and I am having trouble getting results from regexp searches.

{
  "query": {
    "regexp": {
      "lastname": "smit*"
    }
  }
}

This returns 0 results, even though I know I have 'smith' in the data.

I have also tried:

{
  "query": {
    "filtered": {
      "filter": {
        "regexp": {
          "lastname": "smit*"
        }
      }
    }
  }
}

Any help would be appreciated.

Recommended answer

First off, a lot of this depends on how you indexed the field: whether it is analyzed, which tokenizer it uses, whether it is lowercased, and so on.
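
To see why indexing matters, here is a rough Python sketch. It is not Elasticsearch code: `toy_analyze` is a made-up stand-in for an analyzer (lowercase, then split on whitespace), and Python's `re.fullmatch` is used as an approximation of a regexp query that has to match a whole indexed term.

```python
import re

def toy_analyze(text):
    """Hypothetical stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()

def regexp_hits(pattern, terms):
    """Return the terms that the pattern matches in full (anchored match)."""
    return [t for t in terms if re.fullmatch(pattern, t)]

# Assumed example value; the original question only mentions 'smith'.
analyzed_terms = toy_analyze("John Smith")   # indexed as separate lowercase terms
not_analyzed_terms = ["John Smith"]          # whole value kept as a single term

print(regexp_hits("smit.*", analyzed_terms))        # lowercase pattern hits 'smith'
print(regexp_hits("Smit.*", analyzed_terms))        # capitalized pattern finds nothing
print(regexp_hits("smit.*", not_analyzed_terms))    # nothing: term is 'John Smith'
print(regexp_hits(".*Smit.*", not_analyzed_terms))  # matches the whole unanalyzed value
```

The same pattern can hit or miss depending only on what ended up in the index, so it is worth checking the field's mapping before debugging the regexp itself.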

To answer your specific question about regexp queries: assuming your field is indexed as "smith" (all lower case), change your search string to "smit.*", which matches "smith". "smit." also works.
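
As a small sketch, the corrected request body can be built and printed like this. The helper name `regexp_query` is only for illustration; the field name `lastname` and the query shape come from the question.

```python
import json

def regexp_query(field, pattern):
    """Build a regexp query body in the same shape as the one in the question."""
    return {"query": {"regexp": {field: pattern}}}

# Same query as in the question, with "smit*" replaced by "smit.*".
body = regexp_query("lastname", "smit.*")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search request body in place of the original one. The index name is not given in the question, so it is left out here.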

The reason is that regexp syntax differs from wildcard syntax. In a regexp, "." matches any single character and "*" matches zero or more repetitions of the preceding character. Your pattern "smit*" therefore matches "smi", "smit", "smitt" or "smittt", but not "smith", because the pattern has to match the whole indexed term. The construct ".*" means zero or more repetitions of ".", that is, any sequence of characters. The combination of the two is the regexp equivalent of the wildcard "*".
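
The difference is easy to check outside Elasticsearch. The sketch below uses Python's `re.fullmatch` as an approximation of a regexp that must match the whole term, and `fnmatch.fnmatchcase` (shell-style globbing) as a stand-in for a wildcard query. Neither is the Lucene engine, and the term list is invented for the example, but for patterns this simple the behaviour should be the same.

```python
import fnmatch
import re

# Made-up set of indexed terms, for illustration only.
terms = ["smi", "smit", "smitt", "smittt", "smith", "smithson"]

def regexp_matches(pattern):
    """Terms matched in full by a regular expression."""
    return [t for t in terms if re.fullmatch(pattern, t)]

def wildcard_matches(pattern):
    """Terms matched by a glob-style wildcard pattern."""
    return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]

print(regexp_matches("smit*"))    # ['smi', 'smit', 'smitt', 'smittt']
print(regexp_matches("smit.*"))   # ['smit', 'smitt', 'smittt', 'smith', 'smithson']
print(regexp_matches("smit."))    # ['smitt', 'smith']
print(wildcard_matches("smit*"))  # same terms as the regexp "smit.*"
```

Note that "smit." only matches five-letter terms, so it finds "smith" but would miss "smithson".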

That said, I'd caution that regexp and wildcard searches can perform poorly on text indexes, depending on the nature of the field, how it is indexed and the number of documents. These kinds of searches can be very useful, but more than one person has built wildcard or regexp searches, tested them only on small data sets, and then been disappointed by the production performance. Use with caution.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html
