Google cloud dataproc failing to create new cluster with initialization scripts


Problem description

I am using the command below to create a Dataproc cluster:

gcloud dataproc clusters create informetis-dev \
  --initialization-actions "gs://dataproc-initialization-actions/jupyter/jupyter.sh,gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh,gs://dataproc-initialization-actions/hue/hue.sh,gs://dataproc-initialization-actions/ipython-notebook/ipython.sh,gs://dataproc-initialization-actions/tez/tez.sh,gs://dataproc-initialization-actions/oozie/oozie.sh,gs://dataproc-initialization-actions/zeppelin/zeppelin.sh,gs://dataproc-initialization-actions/user-environment/user-environment.sh,gs://dataproc-initialization-actions/list-consistency-cache/shared-list-consistency-cache.sh,gs://dataproc-initialization-actions/kafka/kafka.sh,gs://dataproc-initialization-actions/ganglia/ganglia.sh,gs://dataproc-initialization-actions/flink/flink.sh" \
  --image-version 1.1 \
  --master-boot-disk-size 100GB \
  --master-machine-type n1-standard-1 \
  --metadata "hive-metastore-instance=g-test-1022:asia-east1:db_instance" \
  --num-preemptible-workers 2 \
  --num-workers 2 \
  --preemptible-worker-boot-disk-size 1TB \
  --properties hive:hive.metastore.warehouse.dir=gs://informetis-dev/hive-warehouse \
  --worker-machine-type n1-standard-2 \
  --zone asia-east1-b \
  --bucket info-dev

But Dataproc failed to create the cluster, with the following errors in the failure file:

cat
+ mysql -u hive -phive-password -e ''
ERROR 2003 (HY000): Can't connect to MySQL server on 'localhost' (111)
+ mysql -e 'CREATE USER '\''hive'\'' IDENTIFIED BY '\''hive-password'\'';'
ERROR 2003 (HY000): Can't connect to MySQL server on 'localhost' (111)

Does anyone have any idea what's behind this failure?

Recommended answer

It looks like you're missing the --scopes sql-admin flag, as described in the initialization action's documentation; without it, the CloudSQL proxy cannot authorize its tunnel into your CloudSQL instance.
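
For illustration, a minimal sketch of the create command with the scope added, trimmed here to the flags relevant to the proxy (all values are taken from the question; the rest of the original flags stay unchanged):

gcloud dataproc clusters create informetis-dev \
  --scopes sql-admin \
  --initialization-actions "gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh" \
  --metadata "hive-metastore-instance=g-test-1022:asia-east1:db_instance" \
  --image-version 1.1 \
  --zone asia-east1-b \
  --bucket info-dev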

Additionally, aside from just the scopes, you need to make sure the default Compute Engine service account has the right project-level permissions in whichever project holds your CloudSQL instance. Normally the default service account is a project editor in the GCE project, so that, combined with the sql-admin scope, should be sufficient to access a CloudSQL instance in the same project. But if you're accessing a CloudSQL instance in a separate project, you'll also have to add that service account as a project editor in the project which owns the CloudSQL instance.
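
If the CloudSQL instance does live in a separate project, a hedged sketch of granting that role from the CLI (the project ID and service-account email below are placeholders, not values from the question):

gcloud projects add-iam-policy-binding cloudsql-host-project \
  --member serviceAccount:123456789@project.gserviceaccount.com \
  --role roles/editor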

You can find the email address of your default compute service account under the IAM page for the project deploying Dataproc clusters, with the name "Compute Engine default service account"; it should look something like <number>@project.gserviceaccount.com.
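
As a quick way to confirm the exact email, a sketch using the CLI (the project ID is a placeholder):

gcloud iam service-accounts list --project my-dataproc-project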

