Create EMR 5.3.0 with EMRFS (S3 bucket) as storage

Problem description

I'm trying to create an EMR 5.3.0 cluster with EMRFS (an S3 bucket) as storage. Please provide general guidance on how to do this.

Currently I'm using the command below to create EMR 5.3.0 with InstanceType=m4.2xlarge. It works fine, but I have not been able to get it working with EMRFS as storage.

aws emr create-cluster --name "DEMAPAUR001" --release-label emr-5.3.0 \
  --service-role EMR_DefaultRole_Private --enable-debug --log-uri 's3n://xyz/trn' \
  --ec2-attributes SubnetId=subnet-545e8823,KeyName=XXX \
  --applications Name=Hbase Name=Hive Name=Pig Name=Ganglia \
  --configurations '[{"Classification":"hdfs-site","Properties":{"dfs.replication":"2"},"Configurations":[]}]' \
  --instance-groups \
    'InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
    'InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
  --tags Name=DEMAPAUR001 Owner="XXX" Division=Corporate Application=DEM-EMR Environment=TRN CostCenter=XXX123 CreatedBy=XXX ManagedBy=XXX Availability=24x7_Mon-Fri Backup=NA

Please help me.

Recommended answer

You can use the following classification in the configuration when launching the cluster.

To enable consistent view:

{ "Classification": "emrfs-site", "Properties": { "fs.s3.consistent": "true" } }
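
As a minimal sketch of how this classification could sit alongside the hdfs-site entry from the question, the Python below builds the combined list and serializes it to the JSON string passed to --configurations. The helper name `build_configurations` is hypothetical and used only for illustration; it is not part of the AWS CLI or any AWS library.

```python
import json


def build_configurations(consistent_view=True):
    """Return the classification list for `aws emr create-cluster --configurations`.

    Hypothetical helper: combines the hdfs-site entry from the question
    with the emrfs-site entry from the answer.
    """
    return [
        {
            "Classification": "hdfs-site",
            "Properties": {"dfs.replication": "2"},
            "Configurations": [],
        },
        {
            "Classification": "emrfs-site",
            # Property values are written as strings, as in the answer.
            "Properties": {
                "fs.s3.consistent": "true" if consistent_view else "false"
            },
        },
    ]


if __name__ == "__main__":
    # Compact JSON, suitable for the single-quoted CLI argument.
    print(json.dumps(build_configurations(), separators=(",", ":")))
```

The printed string would replace the existing value of --configurations in the command from the question; the rest of the command stays unchanged.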

Also, if you actually want Hive to point to S3 and store all new files there, you will have to add this classification, which sets the property in hive-site.xml:

{ "Classification": "hive-site", "Properties": { "hive.metastore.warehouse.dir": self.hive_warehouse_dir } }
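
The `self.hive_warehouse_dir` value above appears to be a variable from the answerer's own code rather than literal JSON, so it needs to be replaced with an actual S3 location. A minimal sketch of filling it in, assuming a placeholder path `s3://my-bucket/hive/warehouse` (not from the original) and a hypothetical helper name:

```python
import json


def hive_site_classification(warehouse_dir):
    """Build a hive-site classification pointing the Hive warehouse at S3.

    Hypothetical helper; `warehouse_dir` stands in for the answer's
    `self.hive_warehouse_dir`.
    """
    if not warehouse_dir.startswith("s3://"):
        raise ValueError("expected an s3:// URI, got %r" % warehouse_dir)
    return {
        "Classification": "hive-site",
        "Properties": {"hive.metastore.warehouse.dir": warehouse_dir},
    }


if __name__ == "__main__":
    # Placeholder bucket and prefix; substitute your own.
    print(json.dumps(hive_site_classification("s3://my-bucket/hive/warehouse")))
```

The resulting object can be appended to the same JSON list that is passed to --configurations.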
