Setting Elasticsearch properties in spark-submit


Question

I'm trying to launch Spark jobs that read their input from Elasticsearch, using spark-submit from the command line as described in http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/spark.html

I'm setting the properties in a file, but when I launch spark-submit it prints the following warnings and then fails:

~/spark-1.0.1-bin-hadoop1/bin/spark-submit --class Main --properties-file spark.conf SparkES.jar

Warning: Ignoring non-spark config property: es.resource=myresource
Warning: Ignoring non-spark config property: es.nodes=mynode
Warning: Ignoring non-spark config property: es.query=myquery
...
Exception in thread "main" org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed

My config file looks like this (with the correct values filled in):

es.nodes      nodeip:port
es.resource   index/type
es.query      query

Setting the properties on the Configuration object in the code works, but I need to avoid this workaround.

Is there a way to set those properties via the command line?

Recommended answer

I don't know whether you resolved your issue (if so, how?), but I found this solution, which passes the Elasticsearch settings directly to the call:

import org.elasticsearch.spark.rdd.EsSpark

EsSpark.saveToEs(rdd, "spark/docs", Map("es.nodes" -> "10.0.5.151"))
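If the goal is to keep the settings out of the source, one possible variation is to read them from a file at runtime and hand the resulting map to the same call. The following is only a sketch under assumptions: the object name `EsProps`, the method `load`, and the idea of passing the file path as a program argument are hypothetical and not part of elasticsearch-hadoop or of the original answer. It uses only `java.util.Properties`, which accepts the whitespace-separated `key value` layout shown in the question.

```scala
import java.io.FileInputStream
import java.util.Properties
import scala.collection.JavaConverters._

// Hypothetical helper (not part of elasticsearch-hadoop): loads a
// properties file and keeps only the entries whose key starts with "es.".
object EsProps {
  def load(path: String): Map[String, String] = {
    val props = new Properties()
    val in = new FileInputStream(path)
    // Always close the stream, even if parsing fails.
    try props.load(in) finally in.close()
    props.stringPropertyNames().asScala
      .filter(_.startsWith("es."))
      .map(key => key -> props.getProperty(key).trim)
      .toMap
  }
}

// Possible use inside the job, assuming the file path arrives as args(0):
//   val cfg = EsProps.load(args(0))
//   EsSpark.saveToEs(rdd, cfg("es.resource"), cfg)
```

With this approach the file is passed after the jar on the spark-submit command line as an ordinary application argument rather than through --properties-file, so Spark never sees (or discards) the `es.` keys.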

Bye
