How to make the GPU visible to the ML runtime environment on Databricks?


Problem description

I am trying to run some TensorFlow (2.2) example code on a Databricks GPU cluster (p2.xlarge) with the following environment:

6.6 ML, spark 2.4.5, GPU, Scala 2.11  
Keras version : 2.2.5

nvidia-smi
NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2         

However, I do not want to run shell commands every time the Databricks GPU cluster is restarted.

So I installed TensorFlow from the Databricks Libraries UI with:

  tensorflow==2.2.*

I did not specify whether it should be the GPU or the CPU build. I assumed that it would be for the GPU by default.

I found that the Python 3 code runs only on the CPU, not on the GPU.

  import tensorflow as tf

  physical_devices = tf.config.list_physical_devices()
  # physical_devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
  #                    PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU'),
  #                    PhysicalDevice(name='/physical_device:XLA_GPU:0', device_type='XLA_GPU')]

  visible_devices = tf.config.get_visible_devices()
  # visible_devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

  tf.test.gpu_device_name()  # returns an empty string

  # Other values reported by the runtime:
  # is_built_with_cuda: True
  # is_built_with_gpu_support: True
  # is_built_with_rocm: False
  # is_built_with_xla: True
  # get_soft_device_placement: True
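The output above lists an 'XLA_GPU' device but no plain 'GPU' device. A narrower check that asks only for the 'GPU' device type can make that gap easier to see. The following is a minimal sketch, not part of the original question: the helper name describe_tf_devices is hypothetical, and it assumes TensorFlow 2.1 or later, where tf.config.list_physical_devices and tf.config.get_visible_devices are available.

```python
def describe_tf_devices():
    """Return a small report of what TensorFlow can see, or None if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not available in this environment.
        return None
    return {
        "version": tf.__version__,
        # True when the installed build was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Only devices of type 'GPU' (this excludes 'XLA_GPU').
        "physical_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
        "visible_gpus": [d.name for d in tf.config.get_visible_devices("GPU")],
    }


report = describe_tf_devices()
print(report)
```

If built_with_cuda is True but physical_gpus is empty, the build supports CUDA yet the runtime did not register a GPU device. One possible cause is that the CUDA libraries this TensorFlow build expects could not be loaded on the cluster.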

I am trying to make the 'XLA_GPU' device visible to the ML runtime:

# https://www.tensorflow.org/api_docs/python/tf/config/set_visible_devices
# Make the GPU visible to the TF runtime
import tensorflow as tf

physical_devices = tf.config.list_physical_devices('XLA_GPU')
try:
    # Enable the first GPU
    tf.config.set_visible_devices(physical_devices[0], 'XLA_GPU')  # exception here !!!
    # Query the same device type that was just made visible
    logical_devices = tf.config.list_logical_devices('XLA_GPU')
    # A logical device should have been created for the first GPU
    assert len(logical_devices) == len(physical_devices)
except (ValueError, RuntimeError) as err:
    # Invalid device, or virtual devices cannot be modified once initialized
    print('Invalid device or cannot modify virtual devices once initialized:', err)
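For comparison, the same idea can be sketched against the plain 'GPU' device type instead of 'XLA_GPU'. This is a hedged sketch under stated assumptions rather than a confirmed fix: the helper name make_first_gpu_visible is hypothetical, and the call is assumed to run before any TensorFlow op has initialized the runtime, since set_visible_devices raises RuntimeError once the runtime is initialized.

```python
def make_first_gpu_visible():
    """Try to restrict TensorFlow to the first 'GPU' device and report what happened."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow not installed"

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        # Nothing to make visible: TensorFlow did not register any 'GPU' device.
        return "no GPU device found"

    try:
        tf.config.set_visible_devices(gpus[0], "GPU")
    except (ValueError, RuntimeError) as err:
        # ValueError: invalid device; RuntimeError: runtime already initialized.
        return f"could not change visible devices: {err}"
    return f"visible: {gpus[0].name}"


status = make_first_gpu_visible()
print(status)
```

If the result is "no GPU device found", changing visibility cannot help, because there is no 'GPU' device for TensorFlow to expose in the first place.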

However, this raises an exception.

How can I enable the GPU so that TF code can run on it?

Thanks

Recommended answer

Install tensorflow-gpu instead of tensorflow: it will run primarily on the GPU, whereas tensorflow will run primarily on the CPU. You won't need to edit your code, because the package is still imported under the name tensorflow.
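Before and after swapping packages, it can help to confirm which of the two distributions is actually installed on the cluster. The sketch below uses only the standard library and is not part of the original answer. The helper name installed_tf_packages is hypothetical, and it assumes Python 3.8 or later, where importlib.metadata is available.

```python
from importlib import metadata


def installed_tf_packages(names=("tensorflow", "tensorflow-gpu")):
    """Map each distribution name to its installed version, or None if it is absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            found[name] = None
    return found


packages = installed_tf_packages()
print(packages)
```

Both distributions install the same tensorflow module, so checking the distribution names is a more direct way to see which one is present than inspecting the import.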
