CSV geodata into elasticsearch as a geo_point type using logstash


Below is a reproducible example of the problem I am having, using the most recent versions of logstash and elasticsearch.

I am using logstash to input geospatial data from a csv into elasticsearch as geo_points.

The CSV looks like the following:

$ head simple_base_map.csv 
"lon","lat"
-1.7841,50.7408
-1.7841,50.7408
-1.78411,50.7408
-1.78412,50.7408
-1.78413,50.7408
-1.78414,50.7408
-1.78415,50.7408
-1.78416,50.7408
-1.78416,50.7408

I have created a mapping template that looks like the following:

$ cat simple_base_map_template.json 
{
  "template": "base_map_template",
  "order":    1,
  "settings": {
    "number_of_shards": 1
  },

  "mappings": {
    "node_points": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}

and have a logstash config file that looks like the following:

$ cat simple_base_map.conf 
input {
  stdin {}
}

filter {
  csv {
      columns => [
        "lon", "lat"
      ]
  }

  if [lon] == "lon" {
      drop { }
  } else {
      mutate {
          remove_field => [ "message", "host", "@timestamp", "@version"     ]
      }
      mutate {
          convert => { "lon" => "float" }
          convert => { "lat" => "float" }
      }

      mutate {
          rename => {
              "lon" => "[location][lon]"
              "lat" => "[location][lat]"
          }
      }
  }
}

output {
  stdout { codec => dots }
  elasticsearch {
      index => "base_map_simple"
      template => "simple_base_map_template.json"
      document_type => "node_points"
  }
}

I then run the following:

$ cat simple_base_map.csv | logstash-2.1.3/bin/logstash -f simple_base_map.conf 
Settings: Default filter workers: 16
Logstash startup completed
....................................................................................................Logstash shutdown completed

However, looking at the index base_map_simple suggests the documents do not have a location field of type geo_point in them; instead the field was mapped as two doubles, lat and lon.

$ curl -XGET 'localhost:9200/base_map_simple?pretty'
{
  "base_map_simple" : {
    "aliases" : { },
    "mappings" : {
      "node_points" : {
        "properties" : {
          "location" : {
            "properties" : {
              "lat" : {
                "type" : "double"
              },
              "lon" : {
                "type" : "double"
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1457355015883",
        "uuid" : "luWGyfB3ToKTObSrbBbcbw",
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
          "created" : "2020099"
        }
      }
    },
    "warmers" : { }
  }
}
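The mis-mapping above can also be detected programmatically rather than by eyeballing the JSON. Below is a minimal sketch (the `location_is_geo_point` helper is hypothetical, written here for illustration) that inspects the mapping JSON returned by the cluster and reports whether `location` was mapped as a `geo_point`:

```python
import json

def location_is_geo_point(mapping_json, doc_type="node_points"):
    """Return True if the location field is mapped as geo_point."""
    props = mapping_json["mappings"][doc_type]["properties"]
    return props.get("location", {}).get("type") == "geo_point"

# The mapping returned above: location became an object with two
# double sub-fields, instead of a geo_point.
observed = json.loads("""
{
  "mappings": {
    "node_points": {
      "properties": {
        "location": {
          "properties": {
            "lat": {"type": "double"},
            "lon": {"type": "double"}
          }
        }
      }
    }
  }
}
""")

print(location_is_geo_point(observed))  # False: the template was not applied
```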

How would I need to change any of the above files to ensure that the data goes into Elasticsearch as a geo_point type?

Finally, I would like to be able to carry out a nearest neighbour search on the geo_points by using a command such as the following:

curl -XGET 'localhost:9200/base_map_simple/_search?pretty' -d'
{
    "size": 1,
    "sort": {
        "_geo_distance" : {
            "location" : {
                "lat" : 50,
                "lon" : -1
            },
            "order" : "asc",
            "unit" : "m"
        }
    }
}'

Thanks

Solution

The problem is that in your elasticsearch output you named the index base_map_simple while in your template the template property is base_map_template, hence the template is not being applied when creating the new index. The template property needs to somehow match the name of the index being created in order for the template to kick in.
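Elasticsearch compares the template's `template` property against the index name using simple `*` wildcard patterns. Python's `fnmatch` is a rough stand-in for that matching (an approximation for illustration only, not Elasticsearch's actual matcher), and it shows why the original pattern never fires while the fixed one does:

```python
from fnmatch import fnmatch

index_name = "base_map_simple"

# The original template pattern is a literal name that never
# matches the index actually being created...
print(fnmatch(index_name, "base_map_template"))  # False

# ...while the wildcard pattern from the fix below does match it.
print(fnmatch(index_name, "base_map_*"))         # True
```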

It will work if you simply change the latter to base_map_*, i.e. as in:

{
  "template": "base_map_*",             <--- change this
  "order": 1,
  "settings": {
    "index.number_of_shards": 1
  },
  "mappings": {
    "node_points": {
      "properties": {
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}

UPDATE

Make sure to first delete the current index as well as the template, i.e.

curl -XDELETE localhost:9200/base_map_simple
curl -XDELETE localhost:9200/_template/logstash
