How to set a custom environment variable in EMR to be available for a Spark application
Problem description
I need to set a custom environment variable in EMR to be available when running a Spark application.
I have tried adding this:
...
--configurations '[
{
"Classification": "spark-env",
"Configurations": [
{
"Classification": "export",
"Configurations": [],
"Properties": { "SOME-ENV-VAR": "qa1" }
}
],
"Properties": {}
}
]'
...
and also tried to replace "spark-env with hadoop-env
but nothing seems to work.
There is this answer from the aws forums. but I can't figure out how to apply it.
I'm running on EMR 5.3.1 and launch it with a preconfigured step from the cli: aws emr create-cluster...
Solution
Add a custom configuration like the JSON below to a file, say custom_config.json:
[
  {
    "Classification": "spark-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "VARIABLE_NAME": "VARIABLE_VALUE"
        }
      }
    ]
  }
]
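For the values from the question, a filled-in custom_config.json might look like the sketch below. Note one detail: shell export names cannot contain dashes, so the SOME-ENV-VAR from the attempt above is renamed to a hypothetical SOME_ENV_VAR here, otherwise the export line EMR writes into spark-env.sh would fail when sourced.

[
  {
    "Classification": "spark-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "SOME_ENV_VAR": "qa1"
        }
      }
    ]
  }
]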
Then, when creating the EMR cluster, pass the file reference to the --configurations option:
aws emr create-cluster --configurations file://custom_config.json --other-options...
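On EMR, the spark-env/export classification writes the variable into spark-env.sh, which is sourced by spark-submit, so the exported value is visible to the driver process. A minimal sketch of reading it from a PySpark application, assuming the hypothetical SOME_ENV_VAR name from the example above:

import os
from pyspark.sql import SparkSession

# On EMR this runs under spark-submit, whose environment
# includes the variables exported in spark-env.sh.
spark = SparkSession.builder.appName("env-var-check").getOrCreate()

# SOME_ENV_VAR is the hypothetical variable from the config sketch above.
print("SOME_ENV_VAR =", os.environ.get("SOME_ENV_VAR", "<not set>"))

spark.stop()

If the variable also needs to be visible inside YARN executor processes (not just the driver), Spark separately supports setting spark.executorEnv.[VariableName] in the Spark configuration; that is a different mechanism from the spark-env classification shown in this answer.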