How to use the Azure Python SDK to provision a Databricks service?

Question

[Previously, in this post, I asked how to provision a Databricks service without any workspace. Since that first scenario seems unfeasible, I'm now asking how to provision a service with a workspace.]

As a cloud admin, I've been asked to write a script using the Azure Python SDK that will provision a Databricks service for one of our big data dev teams.

These appear to offer some help provisioning a workspace, but I am not quite there yet.

What am I missing?

Thanks to @Laurent Mazuel and @Jim Xu for their help.

Here's the code I'm running now, and the error I'm receiving:

client = DatabricksClient(credentials, subscription_id)
workspace_obj = client.workspaces.get("example_rg_name", "example_databricks_workspace_name")
WorkspacesOperations.create_or_update(
workspace_obj,
"example_rg_name",
"example_databricks_workspace_name",
custom_headers=None,
raw=False,
polling=True
)

The error:

TypeError: create_or_update() missing 1 required positional argument: 'workspace_name'

I'm a bit puzzled by that error, as I've provided the workspace name as the third parameter, and according to this documentation, that's just what this method requires.
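One plausible reading of this TypeError is that `create_or_update` was called on the `WorkspacesOperations` class itself rather than on `client.workspaces`, so the first positional argument was consumed as `self` and every later argument shifted one slot to the left. The sketch below reproduces that effect with a stand-in class; `FakeWorkspacesOperations` is hypothetical and only mirrors the parameter order (`parameters`, `resource_group_name`, `workspace_name`) seen elsewhere in this post, not the real SDK class.

```python
class FakeWorkspacesOperations:
    """Hypothetical stand-in; NOT the Azure SDK class."""

    def create_or_update(self, parameters, resource_group_name, workspace_name):
        # Echo the arguments so the binding is easy to inspect.
        return (parameters, resource_group_name, workspace_name)


message = None
try:
    # Called on the class: "params" is bound to `self`, the resource group
    # name lands in `parameters`, and `workspace_name` is left unfilled.
    FakeWorkspacesOperations.create_or_update(
        "params", "example_rg_name", "example_databricks_workspace_name"
    )
except TypeError as exc:
    message = str(exc)

# Called on an instance: `self` is supplied automatically and the three
# positional arguments line up with the three named parameters.
ops = FakeWorkspacesOperations()
result = ops.create_or_update(
    "params", "example_rg_name", "example_databricks_workspace_name"
)
```

In the real SDK the bound equivalent would be a call through `client.workspaces`, which is what the second attempt in this question does.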

I've also tried the following code:

client = DatabricksClient(credentials, subscription_id)
workspace_obj = client.workspaces.get("example_rg_name", "example_databricks_workspace_name")
client.workspaces.create_or_update(
workspace_obj,
"example_rg_name",
"example_databricks_workspace_name"
)

This results in:

 Traceback (most recent call last):
   File "./build_azure_visibility_core.py", line 112, in <module>
     ca_databricks.create_or_update_databricks(SUB_PREFIX)
   File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/expd_az_databricks.py", line 34, in create_or_update_databricks
     self.databricks_workspace_name
   File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/azure-visibility-core/lib64/python3.6/site-packages/azure/mgmt/databricks/operations/workspaces_operations.py", line 264, in create_or_update
     **operation_config
   File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/azure-visibility-core/lib64/python3.6/site-packages/azure/mgmt/databricks/operations/workspaces_operations.py", line 210, in _create_or_update_initial
     body_content = self._serialize.body(parameters, 'Workspace')
   File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/azure-visibility-core/lib64/python3.6/site-packages/msrest/serialization.py", line 589, in body
     raise ValidationError("required", "body", True)
 msrest.exceptions.ValidationError: Parameter 'body' can not be None.
 ERROR: Job failed: exit status 1

So line 589 in serialization.py raises the error, but I don't see what in my code is causing it. Thanks to all who have generously assisted!
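Reading the traceback, the serializer is handed `parameters` as the request body and rejects it because it is `None`, which suggests the first argument to `create_or_update` was `None` when the call ran. A minimal sketch of a pre-flight check that surfaces this earlier, with a clearer message, is shown below; `require_parameters` is a hypothetical helper, not part of the Azure SDK or msrest.

```python
def require_parameters(parameters):
    """Hypothetical guard: fail early if the request body is missing."""
    if parameters is None:
        raise ValueError(
            "workspace parameters are None; pass a dict or model "
            "describing the workspace to create"
        )
    return parameters


# Example: a populated body passes through unchanged.
body = require_parameters({"location": "example_location"})
```

Wrapping the first argument as `require_parameters(workspace_obj)` before the SDK call would point at the caller's variable rather than at a line inside serialization.py.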

Recommended answer

With help from @Laurent Mazuel and support engineers at Microsoft, I have a solution:

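# Import path assumed for the azure-mgmt-databricks release that exposes DatabricksClient.
from azure.mgmt.databricks import DatabricksClient

# Full resource ID of the managed resource group for the workspace.
managed_resource_group_ID = "/subscriptions/" + subscription_id + "/resourceGroups/" + managed_rg_name
client = DatabricksClient(credentials, subscription_id)
# Not required for creation; this lookup fails if the workspace does not exist yet.
workspace_obj = client.workspaces.get(rg_name, databricks_workspace_name)
# The workspace definition (request body) comes first, then the
# resource group name, then the workspace name.
client.workspaces.create_or_update(
    {
        "managedResourceGroupId": managed_resource_group_ID,
        "sku": {"name": "premium"},
        "location": location
    },
    rg_name,
    databricks_workspace_name
).wait()  # block until the long-running operation finishes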
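For reuse in a provisioning script, the same call can be split into small functions so the request body can be checked without touching Azure. This is only a sketch under stated assumptions: the function names below are hypothetical, the body mirrors the dict literal in the answer, and `client` is assumed to be a `DatabricksClient` (or anything exposing the same `workspaces.create_or_update(...)` call that returns a poller with `wait()`).

```python
def managed_resource_group_id(subscription_id, managed_rg_name):
    """Build the managed resource group ID string used in the answer."""
    return "/subscriptions/" + subscription_id + "/resourceGroups/" + managed_rg_name


def build_workspace_parameters(subscription_id, managed_rg_name, location,
                               sku_name="premium"):
    """Return the request body in the same shape as the answer's dict."""
    return {
        "managedResourceGroupId": managed_resource_group_id(
            subscription_id, managed_rg_name
        ),
        "sku": {"name": sku_name},
        "location": location,
    }


def provision_workspace(client, rg_name, workspace_name, parameters):
    """Pass the body first, then the names, and wait for completion."""
    poller = client.workspaces.create_or_update(
        parameters, rg_name, workspace_name
    )
    poller.wait()
    return poller
```

A caller would build the body with `build_workspace_parameters(...)` and hand it to `provision_workspace(client, rg_name, databricks_workspace_name, body)`; keeping the body construction separate makes the argument order harder to get wrong.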
