Use bootstrap to replace default jar on EMR
I am on an EMR cluster with AMI 3.0.4. Once the cluster is up, I SSH to the master node and run the following manually:
cd /home/hadoop/share/hadoop/common/lib/
rm guava-11.0.2.jar
wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
chmod 777 guava-14.0.1.jar
Is it possible to do the above in a bootstrap action? Thanks!
With EMR 4.0 the Hadoop installation path changed, so the manual replacement of the Guava jar becomes:
cd /usr/lib/hadoop/lib
sudo wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
sudo rm guava-11.0.2.jar
The bootstrap action in the answer from Sandesh doesn't work for us.
Edit:
We now have a solution for EMR 4.0. Provide a spark-config.json in S3 that sets the extra classpath for both the Spark executor and the Spark driver. In the "Edit software settings (optional)" section you can specify the location of this config file so that it is loaded from S3.
spark-config.json
[
  {
    "classification": "spark",
    "properties": {
      "maximizeResourceAllocation": "true"
    }
  },
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.executor.extraClassPath": "/home/hadoop/lib/guava-14.0.1.jar",
      "spark.driver.extraClassPath": "/home/hadoop/lib/guava-14.0.1.jar"
    }
  }
]
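A single stray comma is enough to make a file like this invalid JSON, so it can be worth checking it locally before uploading it to S3. The following is only a minimal sketch: it assumes python3 is installed on the machine you run it on, and `validate_json` is a helper name made up for this example.

```shell
#!/bin/bash
# Sketch: check that spark-config.json parses as JSON before uploading it.
# ASSUMPTION: python3 is available locally; validate_json is a hypothetical helper.

validate_json() {
    # python's json.tool exits with a non-zero status on a parse error
    python3 -m json.tool "$1" > /dev/null
}

# Only check the file if it exists in the current directory
if [ -f spark-config.json ]; then
    if validate_json spark-config.json; then
        echo "spark-config.json: valid JSON"
    else
        echo "spark-config.json: invalid JSON" >&2
    fi
fi
```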
The guava-14.0.1.jar needs to be downloaded via a bootstrap script: guava_download.sh
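#!/bin/bash
# Download Guava 14.0.1 to the path referenced in spark-config.json
set -e  # stop if any step fails
mkdir -p /home/hadoop/lib/
cd /home/hadoop/lib/
wget https://repo1.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar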
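We set this up through the console, but the same two pieces (the config file and the bootstrap script) could probably also be passed when creating the cluster from the AWS CLI. This is a rough sketch rather than something we ran: the bucket name, cluster name, instance type and instance count are placeholders, and it assumes both files have already been uploaded to that bucket. It only prints the command unless RUN=1 is set.

```shell
#!/bin/bash
# Sketch: launch an EMR 4.0 cluster that uses spark-config.json and guava_download.sh.
# ASSUMPTIONS: BUCKET, the cluster name, instance type and count are placeholders.
BUCKET="my-bucket"   # hypothetical bucket holding both files

cmd=(aws emr create-cluster
    --name "spark-guava-14"
    --release-label emr-4.0.0
    --applications Name=Spark
    --instance-type m3.xlarge
    --instance-count 3
    --use-default-roles
    --configurations "https://s3.amazonaws.com/${BUCKET}/spark-config.json"
    --bootstrap-actions "Path=s3://${BUCKET}/guava_download.sh")

# Print the command for review; set RUN=1 to actually execute it
printf '%s ' "${cmd[@]}"
echo
if [ "${RUN:-0}" = "1" ]; then
    "${cmd[@]}"
fi
```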