Spark job (Java) cannot write data to Elasticsearch cluster

Problem description

I am using Docker 17.04.0-ce and Compose 1.12.0 on Windows to deploy an Elasticsearch cluster (version 5.4.0) in Docker. So far I have done the following:

1) I have created a single Elasticsearch node via Compose with the following configuration:

  elasticsearch1:
    build: elasticsearch/
    container_name: es_1
    cap_add:
      - IPC_LOCK
    environment:
      - cluster.name=cp-es-cluster
      - node.name=cloud1
      - node.master=true
      - http.cors.enabled=true
      - http.cors.allow-origin="*"
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=1
      - xpack.security.enabled=false
      - xpack.monitoring.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      docker_elk:
        aliases:
          - elasticsearch

This results in the node being deployed, but it is not accessible from Spark. I write data as

JavaEsSparkSQL.saveToEs(aggregators.toDF(), collectionName + "/record");

and I get the following error, although the node is running:

I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
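
For context, such a job is typically wired up along these lines. This is only a minimal sketch under my own assumptions (the SparkSession setup, the Docker host address, the input data, and the index name are placeholders, not the asker's code); per the question, even with es.nodes and es.port pointing at the Docker host, the write still timed out until the fix in the answer below.

import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.elasticsearch.spark.sql.api.java.JavaEsSparkSQL;

public class EsWriteJob {
    public static void main(String[] args) {
        // Connector settings; the IP is a placeholder for the Docker host.
        SparkConf conf = new SparkConf()
                .setAppName("es-write")
                .set("es.nodes", "192.168.99.100")
                .set("es.port", "9200");

        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();

        // Stand-in for the asker's aggregators DataFrame.
        Dataset<Row> aggregators = spark.read().json("aggregators.json");

        String collectionName = "aggregators"; // hypothetical index name
        JavaEsSparkSQL.saveToEs(aggregators, collectionName + "/record");
    }
}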

2) I found out that this problem is solved if I add the following line to the node configuration:

- network.publish_host=${ENV_IP}
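
For ${ENV_IP} to be expanded, Compose must see the variable at deploy time. One way to supply it (my assumption; the question does not show this part) is a .env file next to docker-compose.yml, which Compose reads automatically:

# .env — the address is a placeholder for the host IP reachable from Spark
ENV_IP=192.168.99.100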

3) Then I created similar configurations for two additional nodes:

  elasticsearch1:
    build: elasticsearch/
    container_name: es_1
    cap_add:
      - IPC_LOCK
    environment:
      - cluster.name=cp-es-cluster
      - node.name=cloud1
      - node.master=true
      - http.cors.enabled=true
      - http.cors.allow-origin="*"
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=1
      - xpack.security.enabled=false
      - xpack.monitoring.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - network.publish_host=${ENV_IP}
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      docker_elk:
        aliases:
          - elasticsearch

  elasticsearch2:
    build: elasticsearch/
    container_name: es_2
    cap_add:
      - IPC_LOCK
    environment:
      - cluster.name=cp-es-cluster
      - node.name=cloud2
      - http.cors.enabled=true
      - http.cors.allow-origin="*"
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=2
      - xpack.security.enabled=false
      - xpack.monitoring.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=elasticsearch1"
      - node.master=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - esdata2:/usr/share/elasticsearch/data
    ports:
      - 9201:9200
      - 9301:9300
    networks:
      - docker_elk

  elasticsearch3:
    build: elasticsearch/
    container_name: es_3
    cap_add:
      - IPC_LOCK
    environment:
      - cluster.name=cp-es-cluster
      - node.name=cloud3
      - http.cors.enabled=true
      - http.cors.allow-origin="*"
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=2
      - xpack.security.enabled=false
      - xpack.monitoring.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=elasticsearch1"
      - node.master=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - esdata3:/usr/share/elasticsearch/data
    ports:
      - 9202:9200
      - 9302:9300
    networks:
      - docker_elk

This results in a cluster of 3 nodes being created successfully. However, the same error reappears in Spark and data cannot be written to the cluster. I get the same behavior even if I add network.publish_host to all nodes.

Regarding Spark, I use elasticsearch-spark-20_2.11 version 5.4.0 (the same as the ES version). Any ideas on how to solve this issue?

Answer

I managed to solve this. Apart from setting es.nodes and es.port in Spark, the problem goes away if I set es.nodes.wan.only to true.
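
In connector terms, the accepted fix amounts to something like this minimal sketch (the host IP, app name, input data, and index name are placeholders, not the asker's actual job):

import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;
import org.elasticsearch.spark.sql.api.java.JavaEsSparkSQL;

public class EsWriteFixed {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("es-write")
                .set("es.nodes", "192.168.99.100") // Docker host IP (placeholder)
                .set("es.port", "9200")
                // The fix: talk only to the declared nodes and skip node
                // discovery, which would otherwise return the containers'
                // internal publish addresses that Spark cannot reach.
                .set("es.nodes.wan.only", "true");

        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
        JavaEsSparkSQL.saveToEs(spark.read().json("aggregators.json"),
                "aggregators/record"); // index/type are placeholders
    }
}

With es.nodes.wan.only set to true, the connector routes every request through the hosts listed in es.nodes and skips node discovery; discovery otherwise hands back the containers' internal addresses, which a Spark job outside the Docker network cannot reach.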
