Elasticsearch "pattern_replace", replacing whitespaces while analyzing

Question

Basically, I want to remove all whitespace and tokenize the whole string as a single token. (I will apply nGram on top of that later on.)

These are my index settings:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "whitespace_remove": {
            "type": "pattern_replace",
            "pattern": " ",
            "replacement": ""
          }
        },
        "analyzer": {
          "meliuz_analyzer": {
            "filter": [
              "lowercase",
              "whitespace_remove"
            ],
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      }
    }
  }
}

Besides "pattern": " ", I also tried other patterns, including \\s.

But when I analyze the text "beleza na web", it still creates three separate tokens: "beleza", "na" and "web", instead of one single "belezanaweb".

Recommended answer

An analyzer processes a string by tokenizing it first and then applying a series of token filters to the resulting tokens. Because you specified the standard tokenizer, the input has already been split into the separate tokens "beleza", "na" and "web" by the time the pattern_replace filter runs. The filter is then applied to each token individually, and none of those tokens contains whitespace any more, so there is nothing left to replace.
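To make the ordering concrete, here is a minimal Python sketch of the pipeline. It is only a simulation, not Elasticsearch code: the function names are hypothetical, and `standard_tokenize` is a rough stand-in that splits on non-word characters rather than a faithful copy of the standard tokenizer.

```python
import re

def standard_tokenize(text):
    # Rough stand-in for the standard tokenizer (assumption):
    # split the input into runs of word characters.
    return re.findall(r"\w+", text)

def keyword_tokenize(text):
    # The keyword tokenizer emits the whole input as one token.
    return [text]

def whitespace_remove(tokens):
    # Mimics the pattern_replace filter: it sees each token separately.
    return [re.sub(r" ", "", token) for token in tokens]

def analyze(text, tokenizer):
    # Tokenize first, then run the token filters in order.
    tokens = tokenizer(text)
    tokens = [token.lower() for token in tokens]  # "lowercase" filter
    return whitespace_remove(tokens)

with_standard = analyze("beleza na web", standard_tokenize)
with_keyword = analyze("Beleza na Web", keyword_tokenize)

print(with_standard)  # three tokens; the filter had no spaces to remove
print(with_keyword)   # one token with the spaces removed
```

With the standard-style tokenizer the spaces are gone before the filter runs, so the result is still three tokens. With the keyword-style tokenizer the filter receives the full string and collapses it into one token.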

Use the keyword tokenizer instead of the standard tokenizer. It emits the entire input as a single token, so the pattern_replace filter gets to see the whitespace. The rest of your settings are fine. You can change them as follows:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "whitespace_remove": {
            "type": "pattern_replace",
            "pattern": " ",
            "replacement": ""
          }
        },
        "analyzer": {
          "meliuz_analyzer": {
            "filter": [
              "lowercase",
              "whitespace_remove",
              "nGram"
            ],
            "type": "custom",
            "tokenizer": "keyword"
          }
        }
      }
    }
  }
}
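As a rough illustration of what the trailing nGram filter then does with the single token, the Python sketch below generates character n-grams from "belezanaweb". It is a simulation only: the `ngrams` helper is hypothetical, the gram sizes of 1 and 2 are an assumption made for the example, and the emission order is not claimed to match Elasticsearch.

```python
def ngrams(token, min_gram=1, max_gram=2):
    # Collect every substring whose length is between min_gram and max_gram.
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams

# Keyword tokenizer + lowercase + whitespace removal yield one token.
token = "beleza na web".lower().replace(" ", "")
grams = ngrams(token)

print(token)
print(grams)
```

Because the whitespace is removed before the n-grams are built, grams such as "an" and "aw" span what used to be word boundaries, which is the effect the question asks for.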
