Azure Databricks cluster init script - install python wheel

Question

I have a python script that mounts a storage account in databricks and then installs a wheel from the storage account. I am trying to run it as a cluster init script but it keeps failing. My script is of the form:

#/databricks/python/bin/python
mount_point = "/mnt/...."
configs = {....}
source = "...."
if not any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
  dbutils.fs.mount(source = source, mount_point = mount_point, extra_configs = configs)
dbutils.library.install("dbfs:/mnt/.....")
dbutils.library.restartPython()

It works when I run it directly in a notebook, but if I save it to a file called dbfs:/databricks/init_scripts/datalakes/init.py and use it as a cluster init script, the cluster fails to start and the error message says that the init script has a non-zero exit status. I've checked the logs and it appears that it is running as bash instead of python:

bash: line 1: mount_point: command not found

I have tried running the python script from a bash script called init.bash containing this one line:

/databricks/python/bin/python "dbfs:/databricks/init_scripts/datalakes/init.py"

Then the cluster using init.bash fails to start, with the logs saying it can't find the python file:

/databricks/python/bin/python: can't open file 'dbfs:/databricks/init_scripts/datalakes/init.py': [Errno 2] No such file or directory

Can anyone tell me how I could get this working please?

Related question: Azure Databricks cluster init script - install wheel from mounted storage

Recommended answer

The solution I went with was to run a notebook which mounts the storage and creates a bash init script that just installs the wheel. Something like this:

mount_point = "/mnt/...."
configs = {....}
source = "...."
if not any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
  dbutils.fs.mount(source = source, mount_point = mount_point, extra_configs = configs)

# Write a bash init script that only installs the wheel; True overwrites any existing file.
dbutils.fs.put("dbfs:/databricks/init_scripts/datalakes/init.bash", """
        /databricks/python/bin/pip install "../../../dbfs/mnt/package-source/parser-3.0-py3-none-any.whl"
""", True)
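
The pip path in the answer works because DBFS is exposed on cluster nodes as a local directory under /dbfs, which is also why the interpreter could not open the 'dbfs:/...' URI in the question. As a minimal sketch, assuming that /dbfs mount, the script text could be generated like this. The helper names dbfs_uri_to_local_path and build_init_script are hypothetical, and the shebang and `set -e` lines are additions that are not in the original answer:

```python
import shlex

DBFS_SCHEME = "dbfs:/"
# Assumption: DBFS is mounted as a local directory at /dbfs on cluster nodes.
DBFS_LOCAL_ROOT = "/dbfs/"


def dbfs_uri_to_local_path(uri: str) -> str:
    """Map 'dbfs:/a/b' to '/dbfs/a/b' (hypothetical helper)."""
    if not uri.startswith(DBFS_SCHEME):
        raise ValueError(f"not a dbfs:/ URI: {uri!r}")
    return DBFS_LOCAL_ROOT + uri[len(DBFS_SCHEME):].lstrip("/")


def build_init_script(wheel_uri: str) -> str:
    """Return the text of a bash init script that pip-installs one wheel."""
    wheel_path = dbfs_uri_to_local_path(wheel_uri)
    return (
        "#!/bin/bash\n"
        "set -e\n"  # stop on the first failing command
        f"/databricks/python/bin/pip install {shlex.quote(wheel_path)}\n"
    )


# Wheel location taken from the answer above.
script = build_init_script("dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl")
print(script)
```

In a notebook, the resulting string could be passed as the second argument of dbutils.fs.put in place of the inline text. The storage still has to be mounted before the cluster starts, since the init script only installs the wheel.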
