How to implement DevOps on ADF with Databricks Activity


Problem description

I am trying to implement DevOps on ADF, and it was successful for pipelines whose activities fetch data from ADLS locations and SQL.

But now I have a pipeline in which one of the activities runs a jar file from a dbfs location, as shown below.

This pipeline runs a jar file stored in the dbfs location and then proceeds.

The connection parameters for the cluster are as shown below.

While deploying the ARM template from the dev ADF to the UAT instance, which uses the UAT instance of Databricks, I was not able to override any of the cluster connection details from the arm_template_parameter.json file.

  1. How do I configure the workspace URL and cluster ID for the UAT/PROD environments at the time of ARM deployment? There is no entry for any of the cluster details in the arm_template_parameter.json file.

  2. As shown in the first picture, there is an activity which picks the jar file, with a system-generated jar file name, from the DEV instance's dbfs location. Will it fail when the ARM template for this pipeline is deployed to other environments? If so, how do I deploy the same jar file with the same name to the DEV/PROD Databricks dbfs locations?

Any leads appreciated!

Answer

What you have to do here is modify the existing custom parameterization template to fit your needs. This template controls which ARM template parameters are generated when you publish the factory. It can be edited in the Parameterization template tab in the management hub.

By default, the workspace name and URL should already be generated in the ARM template. To have your existing cluster ID included as well, add existingClusterId (the JSON field name in the linked service) to the template under Microsoft.DataFactory/factories/linkedServices.
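As a sketch, the relevant entry in the custom parameterization template might look like the following (the `*` key matches any linked service, and `-` indicates the generated parameter should have no default value; the exact shape of the rest of your template may differ):

```json
{
    "Microsoft.DataFactory/factories/linkedServices": {
        "*": {
            "properties": {
                "typeProperties": {
                    "existingClusterId": "-"
                }
            }
        }
    }
}
```

After saving this template and publishing the factory again, the generated ARM template should expose the cluster ID as a deployment parameter.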

While I don't like sharing documentation links on this forum, this exact use case is actually demoed at https://docs.microsoft.com/azure/data-factory/continuous-integration-deployment#example-parameterizing-an-existing-azure-databricks-interactive-cluster-id

