How can multiple files be specified with "-files" in the AWS CLI for EMR?

Question

I am trying to start an Amazon EMR cluster via the AWS CLI, but I am confused about how to specify multiple files. My current call is as follows:

aws emr create-cluster \
  --steps Type=STREAMING,Name='Intra country development',ActionOnFailure=CONTINUE,Args=[-files,s3://betaestimationtest/mapper.py,-files,s3://betaestimationtest/reducer.py,-mapper,mapper.py,-reducer,reducer.py,-input,s3://betaestimationtest/output_0_inter,-output,s3://betaestimationtest/output_1_intra] \
  --ami-version 3.1.0 \
  --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=2,InstanceType=m3.xlarge \
  --auto-terminate \
  --log-uri s3://betaestimationtest/logs

However, Hadoop now complains that it cannot find the reducer file:

Caused by: java.io.IOException: Cannot run program "reducer.py": error=2, No such file or directory

What am I doing wrong? The file does exist in the folder I specified.

Recommended answer

To pass multiple files in a streaming step, you need to pass the steps as a JSON file, referenced with file://.

The AWS CLI shorthand syntax uses a comma as the delimiter to separate a list of args. So when we try to pass parameters like "-files","s3://betaestimationtest/mapper.py,s3://betaestimationtest/reducer.py", the shorthand parser splits the value at the comma and treats the mapper.py and reducer.py URIs as two separate arguments, rather than as the single comma-separated value that -files expects.
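To make the difference concrete, here is a small Python sketch. It is not the real AWS CLI parser; naive_shorthand_split is a hypothetical helper that only mimics the "split on every comma" behaviour described above, and it is compared with the same arguments written as a JSON array.

```python
import json

# Simplified illustration only: this is NOT the actual AWS CLI shorthand parser.
def naive_shorthand_split(args_value: str) -> list[str]:
    """Split a shorthand Args=[...] value on every comma."""
    return args_value.strip("[]").split(",")

shorthand = (
    "[-files,s3://betaestimationtest/mapper.py,"
    "s3://betaestimationtest/reducer.py,-mapper,mapper.py]"
)
tokens = naive_shorthand_split(shorthand)

# In JSON, the comma inside the quoted string is part of the value.
json_args = json.loads(
    '["-files", '
    '"s3://betaestimationtest/mapper.py,s3://betaestimationtest/reducer.py", '
    '"-mapper", "mapper.py"]'
)

print("shorthand-style split:", tokens)
print("JSON array:           ", json_args)
```

With the comma split, the token that follows -files holds only the mapper.py URI and the reducer.py URI becomes a stray argument. In the JSON array, both URIs stay together in one element.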

The workaround is to use the JSON format. See the example below.

aws emr create-cluster --steps file://./mysteps.json --ami-version 3.1.0 --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=2,InstanceType=m3.xlarge --auto-terminate --log-uri s3://betaestimationtest/logs

mysteps.json looks like:

[
    {
        "Name": "Intra country development",
        "Type": "STREAMING",
        "ActionOnFailure": "CONTINUE",
        "Args": [
            "-files",
            "s3://betaestimationtest/mapper.py,s3://betaestimationtest/reducer.py",
            "-mapper",
            "mapper.py",
            "-reducer",
            "reducer.py",
            "-input",
            "s3://betaestimationtest/output_0_inter",
            "-output",
            "s3://betaestimationtest/output_1_intra"
        ]
    }
]
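If you would rather generate the steps file than write it by hand, a minimal Python sketch could look like the following. The helper name build_streaming_step and its parameters are made up for this illustration; the bucket, script names and step name come from the question. It uses only the standard library and does not call AWS.

```python
import json
from pathlib import Path

BUCKET = "s3://betaestimationtest"  # bucket taken from the question

def build_streaming_step(name, files, mapper, reducer, input_uri, output_uri):
    """Return one streaming step dict; all files go into a single -files value."""
    return {
        "Name": name,
        "Type": "STREAMING",
        "ActionOnFailure": "CONTINUE",
        "Args": [
            "-files", ",".join(files),  # one comma-separated string, not two args
            "-mapper", mapper,
            "-reducer", reducer,
            "-input", input_uri,
            "-output", output_uri,
        ],
    }

steps = [
    build_streaming_step(
        name="Intra country development",
        files=[f"{BUCKET}/mapper.py", f"{BUCKET}/reducer.py"],
        mapper="mapper.py",
        reducer="reducer.py",
        input_uri=f"{BUCKET}/output_0_inter",
        output_uri=f"{BUCKET}/output_1_intra",
    )
]

if __name__ == "__main__":
    # Write the file that --steps file://./mysteps.json points at.
    Path("mysteps.json").write_text(json.dumps(steps, indent=4))
```

Running the script writes mysteps.json to the current directory, which can then be passed to the create-cluster command above via --steps file://./mysteps.json.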

You can also find examples here: https://github.com/aws/aws-cli/blob/develop/awscli/examples/emr/create-cluster-examples.rst. See example 13.

Hope this helps!
