Use bootstrap to replace default jar on EMR

Problem description

I am on an EMR cluster with AMI 3.0.4. Once the cluster is up, I SSH to the master node and do the following manually:

cd /home/hadoop/share/hadoop/common/lib/
rm guava-11.0.2.jar
wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
chmod 777 guava-14.0.1.jar

Is it possible to do the above in a bootstrap action? Thanks!

Solution

With EMR 4.0, the Hadoop installation path changed, so the manual update to guava-14.0.1.jar must be changed to:

cd /usr/lib/hadoop/lib
sudo wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
sudo rm guava-11.0.2.jar

The bootstrap action in Sandesh's answer doesn't work for us.

Edit:

We now have a solution for EMR 4.0. You have to provide a spark-config.json in S3 that sets the extra classpath for both the Spark executor and the driver. In the "Edit software settings (optional)" section, you can specify the location of this config file and load it from S3.

spark-config.json

[
  {
    "classification": "spark",
    "properties": {
      "maximizeResourceAllocation": "true"
    }
  },
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.executor.extraClassPath": "/home/hadoop/lib/guava-14.0.1.jar",
      "spark.driver.extraClassPath": "/home/hadoop/lib/guava-14.0.1.jar"
    }
  }
]
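Because the same JAR path appears twice in the config, one way to keep the entries from drifting apart is to generate the file from a single variable. The following is only a sketch: the variable names GUAVA_JAR and CONFIG_FILE are made up for illustration, and the bucket in the commented upload line is a hypothetical placeholder.

```bash
#!/bin/bash
# Sketch: write spark-config.json from one variable so both classpath
# entries always point at the same JAR. Variable names are illustrative.
set -eu

GUAVA_JAR="/home/hadoop/lib/guava-14.0.1.jar"
CONFIG_FILE="spark-config.json"

# Unquoted heredoc delimiter, so $GUAVA_JAR is expanded inside the JSON.
cat > "$CONFIG_FILE" <<EOF
[
  {
    "classification": "spark",
    "properties": {
      "maximizeResourceAllocation": "true"
    }
  },
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.executor.extraClassPath": "$GUAVA_JAR",
      "spark.driver.extraClassPath": "$GUAVA_JAR"
    }
  }
]
EOF

echo "Wrote $CONFIG_FILE"

# Upload step, left commented out because the bucket name is hypothetical:
# aws s3 cp "$CONFIG_FILE" s3://my-bucket/spark-config.json
```

The generated file has no trailing comma after the last property, which matters because strict JSON parsers reject one.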

The guava-14.0.1.jar needs to be downloaded via a bootstrap script, guava_download.sh:

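#!/bin/bash
# Create the directory referenced by the extraClassPath settings in spark-config.json.
mkdir -p /home/hadoop/lib/
cd /home/hadoop/lib/
# Download Guava 14.0.1 from Maven Central.
wget https://repo1.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar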
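For anyone launching from the command line rather than the console, the config file and the bootstrap script can probably be wired together with the AWS CLI along these lines. This is a sketch, not a tested launch command: the bucket name, cluster name, key-pair name, instance type, and instance count are all hypothetical placeholders, and both files are assumed to have been uploaded to that bucket already. By default the script only prints the command.

```bash
#!/bin/bash
# Sketch: launch an EMR 4.0 cluster that loads spark-config.json and runs
# guava_download.sh as a bootstrap action. All names below are placeholders.
set -eu

BUCKET="my-bucket"         # hypothetical S3 bucket holding both files
DRY_RUN="${DRY_RUN:-1}"    # 1 = print the command only, 0 = run it

# Build the command as positional parameters so each argument stays intact.
set -- aws emr create-cluster \
  --name guava-14-cluster \
  --release-label emr-4.0.0 \
  --applications Name=Spark \
  --instance-type m3.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --ec2-attributes KeyName=my-key \
  --configurations "https://s3.amazonaws.com/${BUCKET}/spark-config.json" \
  --bootstrap-actions "Path=s3://${BUCKET}/guava_download.sh,Name=DownloadGuava"

if [ "$DRY_RUN" = "1" ]; then
  echo "$*"
else
  "$@"
fi
```

Running it with DRY_RUN=0 would actually create the cluster, so it is worth reviewing the printed command first.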
