Create EMR 5.3.0 with EMRFS (S3 bucket) as storage
Question
I'm trying to create an EMR 5.3.0 cluster with EMRFS (an S3 bucket) as storage. Please provide general guidance on this.
Currently I'm using the command below to create an EMR 5.3.0 cluster with InstanceType=m4.2xlarge. It works fine, but I haven't been able to get it to use EMRFS as storage.
aws emr create-cluster --name "DEMAPAUR001" \
  --release-label emr-5.3.0 \
  --service-role EMR_DefaultRole_Private \
  --enable-debug \
  --log-uri 's3n://xyz/trn' \
  --ec2-attributes SubnetId=subnet-545e8823,KeyName=XXX \
  --applications Name=Hbase Name=Hive Name=Pig Name=Ganglia \
  --configurations '[{"Classification":"hdfs-site","Properties":{"dfs.replication":"2"},"Configurations":[]}]' \
  --instance-groups \
    'InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
    'InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
  --tags Name=DEMAPAUR001 Owner="XXX" Division=Corporate Application=DEM-EMR Environment=TRN CostCenter=XXX123 CreatedBy=XXX ManagedBy=XXX Availability=24x7_Mon-Fri Backup=NA
Please help.
Recommended answer
You can pass the following classifications in the cluster configuration (the --configurations option) when launching the cluster.
To enable EMRFS consistent view:
{ "Classification": "emrfs-site", "Properties": { "fs.s3.consistent": "true" } }
Also, if you actually want Hive to point to S3 and store all new files there, you will have to add a hive-site classification as well, which sets the property in hive-site.xml. In the snippet below, self.hive_warehouse_dir appears to be a variable from the answerer's own code; substitute the S3 path of your warehouse directory as a quoted string.
{ "Classification": "hive-site", "Properties": { "hive.metastore.warehouse.dir": self.hive_warehouse_dir } }
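Putting the pieces together, here is a minimal sketch of how the classifications from the question and this answer could be combined into a single configurations file. The bucket name and prefix in WAREHOUSE_DIR and the file name emr-configurations.json are hypothetical placeholders, not values from the original post.

```shell
#!/bin/sh
# Hypothetical S3 location for the Hive warehouse; replace with your own bucket/prefix.
WAREHOUSE_DIR="s3://my-example-bucket/hive/warehouse"

# Write all classifications to one JSON file (the file name is an arbitrary choice).
# The heredoc is unquoted so that ${WAREHOUSE_DIR} is expanded by the shell.
cat > emr-configurations.json <<EOF
[
  {
    "Classification": "hdfs-site",
    "Properties": { "dfs.replication": "2" }
  },
  {
    "Classification": "emrfs-site",
    "Properties": { "fs.s3.consistent": "true" }
  },
  {
    "Classification": "hive-site",
    "Properties": { "hive.metastore.warehouse.dir": "${WAREHOUSE_DIR}" }
  }
]
EOF
```

The file can then replace the inline JSON in the create-cluster command from the question by passing --configurations file://./emr-configurations.json. Keeping the JSON in a file avoids shell-quoting problems as the list of classifications grows.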