Elasticsearch - using the path hierarchy tokenizer to access different levels of categories

Views: 140 | Published: 2017/8/7 0:12:22 | Tags: path, elasticsearch, hierarchy, categories


Problem description

I'm very new to Elasticsearch and have a question about the path hierarchy tokenizer. Here is my code example:

My mapping code:

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "path-analyzer": {
          "type": "custom",
          "tokenizer": "path-tokenizer"
        }
      },
      "tokenizer": {
        "path-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "."
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "dynamic": "strict",
      "properties": {
        "group_path": {
          "type": "string",
          "index_analyzer": "path-analyzer",
          "search_analyzer": "keyword"
        }
      }
    }
  }
}

This is my PUT:

PUT /my_index/my_type/1
{
  "group_path": ["Book.Thriller.Adult","DVD.Comedy.Kids"]
}

This is my query:

GET /my_index/my_type/_search?search_type=count
{
   "aggs": {
      "category": {
         "terms": {
            "field": "group_path",
            "size": 0
         }
      }
   }
}

And the result:

{
   ...
   "aggregations": {
      "category": {
         "buckets": [
            {
               "key": "Book",
               "doc_count": 1
            },
            {
               "key": "Book.Thriller",
               "doc_count": 1
            },
            {
               "key": "Book.Thriller.Adult",
               "doc_count": 1
            },
            {
               "key": "DVD",
               "doc_count": 1
            },
            {
               "key": "DVD.Comedy",
               "doc_count": 1
            },
            {
               "key": "DVD.Comedy.Kids",
               "doc_count": 1
            }
         ]
      }
   }
}
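
The six buckets come from the way the path_hierarchy tokenizer indexes each value: every prefix of the path becomes its own term. As a rough illustration only, here is a small Python sketch that emulates this splitting outside Elasticsearch. The function name `path_hierarchy_tokens` is made up for this example and is not part of any Elasticsearch client.

```python
def path_hierarchy_tokens(value, delimiter="."):
    """Emulate the prefixes a path_hierarchy tokenizer would emit.

    Hypothetical helper for illustration; it only mimics the default
    behaviour (no skip, no reverse) with a configurable delimiter.
    """
    parts = value.split(delimiter)
    # Join the first 1, 2, ... n parts to get each prefix, shortest first.
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


# The two values stored in group_path for document 1.
doc_values = ["Book.Thriller.Adult", "DVD.Comedy.Kids"]

tokens = [t for v in doc_values for t in path_hierarchy_tokens(v)]
print(tokens)
```

Each of these emulated tokens corresponds to one bucket key in the result above, which is why a plain terms aggregation returns every level at once.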

So far, everything is good. What I'm looking for is a way to create buckets for only one level, for example only the first category. How can I get a result like this:

{
   ...
   "aggregations": {
      "category": {
         "buckets": [
            {
               "key": "Book",
               "doc_count": 1
            },
            {
               "key": "DVD",
               "doc_count": 1
            }
         ]
      }
   }
}

Thank you for any help.

Solution

The only way I found to do this is to use the exclude parameter of the terms aggregation with a regular expression that excludes the levels you don't want.

{
   "aggs": {
      "category": {
         "terms": {
            "field": "group_path",
            "size": 0,
            "exclude": ".*\\..*"
         }
      }
   }
}

This will then return:

"aggregations": {
   "category": {
      "buckets": [
         {
            "key": "Book",
            "doc_count": 1
         },
         {
            "key": "DVD",
            "doc_count": 1
         }
      ]
   }
}
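
To see why that pattern leaves only the top-level keys, the filtering can be emulated in Python. This is a sketch under assumptions: Elasticsearch evaluates these patterns with Lucene's regular expression syntax, which has to match the whole term, so `re.fullmatch` is used here as a stand-in. The `apply_filters` helper is hypothetical and exists only for this illustration.

```python
import re

# Terms produced for group_path, as seen in the unfiltered aggregation.
terms = [
    "Book", "Book.Thriller", "Book.Thriller.Adult",
    "DVD", "DVD.Comedy", "DVD.Comedy.Kids",
]


def apply_filters(terms, include=None, exclude=None):
    """Keep terms that fully match `include` (if given) and do not
    fully match `exclude` (if given). Hypothetical helper."""
    kept = []
    for term in terms:
        if include is not None and not re.fullmatch(include, term):
            continue
        if exclude is not None and re.fullmatch(exclude, term):
            continue
        kept.append(term)
    return kept


# ".*\\..*" in the JSON body is the regex .*\..* once JSON unescapes it:
# any term that contains at least one literal dot.
top_level = apply_filters(terms, exclude=r".*\..*")
print(top_level)  # ['Book', 'DVD']
```

The doubled backslash in the request body is JSON escaping only; the regular expression engine sees a single `\.`, meaning a literal dot.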

If you select Book, you can then search like this:

{
   "query": {
      "filtered": {
         "filter": {
            "prefix": {
               "group_path": "Book"
            }
         }
      }
   },
   "aggs": {
      "category": {
         "terms": {
            "field": "group_path",
            "size": 0,
            "include": "Book\\..*",
            "exclude": ".*\\..*\\..*"
         }
      }
   }
}

This will then return:

"aggregations": {
   "category": {
      "buckets": [
         {
            "key": "Book.Thriller",
            "doc_count": 1
         }
      ]
   }
}
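
The same include/exclude pair can be generated for any depth. The following Python sketch is one possible way to build the two patterns from the currently selected category. `level_patterns` is a hypothetical helper, not part of Elasticsearch, and it assumes "." is the delimiter and that patterns must match the whole term. Python's `re.escape` is used for escaping, which agrees with the patterns above for plain names like these but is not guaranteed to match Lucene's escaping rules for every character.

```python
import re


def level_patterns(selected=None):
    """Return (include, exclude) regexes for the direct children of
    `selected`; selected=None means the top level. Hypothetical helper."""
    if selected is None:
        # Top level: no include needed, drop anything containing a dot.
        return None, r".*\..*"
    depth = selected.count(".") + 1
    # Children must start with "<selected>." ...
    include = re.escape(selected) + r"\..*"
    # ... and must not contain more than `depth` dots.
    exclude = ".*" + r"\..*" * (depth + 1)
    return include, exclude


terms = [
    "Book", "Book.Thriller", "Book.Thriller.Adult",
    "DVD", "DVD.Comedy", "DVD.Comedy.Kids",
]

include, exclude = level_patterns("Book")
children = [
    t for t in terms
    if re.fullmatch(include, t) and not re.fullmatch(exclude, t)
]
print(include, exclude)
print(children)  # ['Book.Thriller']
```

When these strings are placed in a JSON request body, each backslash has to be doubled, which gives the "Book\\..*" and ".*\\..*\\..*" values used in the query above.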
