Change default mapping of string to "not analyzed" in Elasticsearch


Question

In my system, data is always inserted from CSV files via Logstash, and I never pre-define the mapping. Whenever I input a string it is always treated as analyzed, so an entry like hello I am Sinha is split into hello, I, am, Sinha. Is there any way to change the default/dynamic mapping of Elasticsearch so that all strings, irrespective of index and type, are treated as not analyzed? Or is there a way of setting it in the .conf file? Say my conf file looks like this:

input {  
      file {
          path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"
          type => "promosms_dec15"
          start_position => "beginning"
          sincedb_path => "/dev/null"
      }
}
filter {

    csv {
        columns => ["Comm_Plan","Queue_Booking","Order_Reference","Multi_Ordertype"]
        separator => ","
    }  
    ruby {
          code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
    }

}
output {  
    elasticsearch { 
        action => "index"
        host => "localhost"
        index => "promosms-%{+dd.MM.YYYY}"
        workers => 1
    }
}

I want all strings to be not analyzed, and I don't mind this becoming the default for all future data inserted into Elasticsearch.

Recommended answer

You can query the .raw version of your field. This was added in Logstash 1.3.1:


The logstash index template we provide adds a ".raw" field to every field you index. These ".raw" fields are set by logstash as "not_analyzed" so that no analysis or tokenization takes place – our original value is used as-is!
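
For illustration, the mechanism described in that quote can be sketched as an Elasticsearch 1.x-style index template containing a dynamic template for string fields. This is a simplified sketch, not the exact template Logstash ships; the function name raw_subfield_template and the promosms-* pattern are assumptions introduced here. Also worth checking: as far as I know the bundled template only applies to indices whose names match logstash-*, so an index named promosms-... would likely need its own template along these lines.

```python
import json


def raw_subfield_template(index_pattern):
    """Build an index-template body (Elasticsearch 1.x style, assumed) that
    keeps every string field analyzed and adds a not_analyzed "raw" sub-field."""
    return {
        # Index name pattern the template applies to (hypothetical value below).
        "template": index_pattern,
        "mappings": {
            # "_default_" applies the rule to every type in matching indices.
            "_default_": {
                "dynamic_templates": [
                    {
                        "string_fields": {
                            "match": "*",
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "analyzed",
                                "fields": {
                                    # Untokenized copy, queried as <field>.raw
                                    "raw": {
                                        "type": "string",
                                        "index": "not_analyzed",
                                    }
                                },
                            },
                        }
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(raw_subfield_template("promosms-*"), indent=2))
```

The printed JSON is the kind of body one would register through the index template API (PUT /_template/<name>) before the first document is indexed. If only the untokenized form is wanted, setting "index" to "not_analyzed" on the main mapping and dropping "fields" should give strings that are never analyzed.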

So if your field is called foo, you'd query foo.raw to get the not_analyzed (not split on delimiters) version.
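
As a minimal sketch of what such a query could look like, the snippet below builds a term query body against the .raw sub-field. The helper name raw_term_query is hypothetical, and the field name foo and the sample value are taken from the text above; the resulting JSON would be sent as the body of a search request.

```python
import json


def raw_term_query(field, value):
    """Build a term query that matches `value` exactly against the
    not_analyzed ".raw" sub-field of `field`."""
    return {"query": {"term": {field + ".raw": value}}}


if __name__ == "__main__":
    # Matches only documents whose foo is exactly this whole string,
    # assuming the .raw sub-field exists in the index mapping.
    print(json.dumps(raw_term_query("foo", "hello I am Sinha"), indent=2))
```

A term query is used here because it does not analyze the search text, which fits a field that was stored without tokenization.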
