Elastic Search limit results

Question

In MySQL I can do something like:

  SELECT id FROM table WHERE field = 'foo' LIMIT 5

If the table has 10,000 rows, then this query is way way faster than if I left out the LIMIT part.

In ElasticSearch, I've got the following:

 {
    "query":{
       "fuzzy_like_this_field":{
          "body":{
             "like_text":"REALLY LONG (snip) TEXT HERE",
             "max_query_terms":1,
             "min_similarity":0.95,
             "ignore_tf":true
          }
       }
    }
 }

When I run this search, it takes a few seconds, whereas mysql can return results for the same query in far, far less time.

If I pass in the size parameter (set to 1), it successfully only returns 1 result, but the query itself isn't any faster than if I had set the size to unlimited and returned all the results. I suspect the query is being run in its entirety and only 1 result is being returned after the query is done processing. This means the "size" attribute is useless for my purposes.

Is there any way to have my search stop searching as soon as it finds a single record that matches the fuzzy search, rather than processing every record in the index before returning a response? Am I misunderstanding something more fundamental about this?

Thanks in advance.

Solution

You are correct that the query is being run in its entirety. Queries by default return data sorted by score, so your query is going to score each matching document. The docs state that the fuzzy query isn't going to scale well, so you might want to consider other queries.

A limit filter might give you behavior similar to what you're looking for. The docs describe it as:

A limit filter limits the number of documents (per shard) to execute on
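
As a rough sketch of how that could look, the snippet below builds a request body that wraps the fuzzy query from the question in a `filtered` query with a `limit` filter. It assumes an Elasticsearch 1.x-style cluster where the `filtered` query and `limit` filter are still available; the helper name `limit_filtered_body` and the per-shard value of 100 are made up for illustration. Whether this actually speeds up the fuzzy query would need to be measured.

```python
import json


def limit_filtered_body(like_text, per_shard_limit=100):
    """Build a request body that runs the fuzzy query under a limit filter.

    per_shard_limit is an illustrative value, not a recommendation.
    """
    return {
        "size": 1,  # only one hit is wanted back
        "query": {
            "filtered": {
                # Cap the number of documents (per shard) the query executes on.
                "filter": {"limit": {"value": per_shard_limit}},
                # The fuzzy query from the question, unchanged.
                "query": {
                    "fuzzy_like_this_field": {
                        "body": {
                            "like_text": like_text,
                            "max_query_terms": 1,
                            "min_similarity": 0.95,
                            "ignore_tf": True,
                        }
                    }
                },
            }
        },
    }


body = limit_filtered_body("REALLY LONG (snip) TEXT HERE")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. Note that the limit applies per shard, so the total number of documents examined depends on how many shards the index has.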

To replicate the MySQL `field = 'foo'` condition, try using a term filter. You should use filters when you don't care about scoring; they are faster and cacheable.
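
A minimal sketch of the term-filter equivalent of `WHERE field = 'foo' LIMIT 5`, again assuming the 1.x-style `filtered` query. The helper name `term_filter_body` is hypothetical, and the field is assumed to be indexed so that the exact term `foo` exists (for example, a `not_analyzed` string).

```python
import json


def term_filter_body(field, value, size=5):
    """Build a request body that matches an exact term without scoring."""
    return {
        "size": size,  # plays the role of LIMIT in the SQL query
        "query": {
            "filtered": {
                # A filter only decides match / no match, so no scores are computed.
                "filter": {"term": {field: value}}
            }
        },
    }


body = term_filter_body("field", "foo")
print(json.dumps(body, indent=2))
```

Unlike the fuzzy query, this only finds documents whose indexed term equals the value exactly, so it is a replacement for the SQL equality check rather than for fuzzy matching.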
