Elasticsearch exact match performance for very long strings


Question

I have a use case:

I need to extract pieces of information from a single URL and save each piece as a separate data unit, to be shown on different pages. When a user visits a data unit on a page, I want to list all the other data units extracted from the same original URL.

I intend to define the original URL field as a not_analyzed string field and then use an exact match to fetch all the pieces extracted from that URL.

My question is:

The original URL could be very long. How efficient is Elasticsearch at exact matching on very long strings? Does Elasticsearch use some sort of hash algorithm, such as git's, for exact matching on long strings?

This use case will be used heavily, so getting an answer is quite important to me.

Thanks in advance.

Recommended answer

To match documents exactly on a not_analyzed field, you can use a term query, which will:

Find documents that contain the exact term specified in the inverted index.

For example:

POST _search
{
  "query": {
    "term" : { "url" : "google.com" } 
  }
}
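The query above only behaves as an exact match if the url field is indexed as a single, unanalyzed term. As a rough sketch (not part of the original answer), the field definition could look like the following. The helper name url_field_mapping is hypothetical; the legacy form is the not_analyzed string field the question describes, and the keyword form is its equivalent in Elasticsearch 5.0 and later.

```python
import json


def url_field_mapping(legacy=True):
    """Return a 'properties' fragment that indexes `url` as one exact term.

    Hypothetical helper for illustration; only the field definitions
    themselves are Elasticsearch syntax.
    """
    if legacy:
        # Pre-5.0 style: a string field with analysis switched off.
        field = {"type": "string", "index": "not_analyzed"}
    else:
        # 5.0+ style: the keyword type stores the value as a single term.
        field = {"type": "keyword"}
    return {"properties": {"url": field}}


if __name__ == "__main__":
    # Print both variants so they can be pasted into a mapping request.
    print(json.dumps(url_field_mapping(legacy=True), indent=2))
    print(json.dumps(url_field_mapping(legacy=False), indent=2))
```

Where this fragment goes inside the full mapping request depends on the Elasticsearch version, so it is left as a fragment here.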

I can't really speak to performance. But this query matches the value as is: no transformation is applied to the url, because the field is not_analyzed.
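The question also asks about hashing, which the answer does not address. One option, sketched here as an assumption and not as something the answer recommends or measures, is to compute a fixed-length digest of the URL on the client and run the term query against that digest. The field name url_hash and the helper functions are hypothetical. A digest also keeps the indexed term short, which matters because Lucene rejects single terms longer than 32766 bytes.

```python
import hashlib


def url_digest(url):
    """Return a fixed-length (40 hex characters) SHA-1 digest of the URL."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def build_document(url, content):
    """Build a data unit that carries both the raw URL and its digest.

    `url_hash` is a hypothetical field; it would need the same
    not_analyzed (or keyword) mapping as `url`.
    """
    return {"url": url, "url_hash": url_digest(url), "content": content}


def same_source_query(url):
    """Build a term query that matches every unit extracted from `url`."""
    return {"query": {"term": {"url_hash": url_digest(url)}}}


if __name__ == "__main__":
    long_url = "http://example.com/" + "a" * 5000
    print(build_document(long_url, "first piece")["url_hash"])
    print(same_source_query(long_url))
```

Whether this is any faster than a term query on the full URL is not established here; it would need to be benchmarked on real data.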
