How do I have condor automatically import my conda environment when running my python jobs?

Question

I am submitting my jobs to condor, but it says that tensorboard is not installed. That is false: I ran it in an interactive job, so it is installed.

How do I have condor use my currently active conda environment?

My condor submit script:

####################
#
# Experiments script
# Simple HTCondor submit description file
#
# reference: https://gitlab.engr.illinois.edu/Vision/vision-gpu-servers/-/wikis/HTCondor-user-guide#submit-jobs
#
# chmod a+x test_condor.py
# chmod a+x experiments_meta_model_optimization.py
# chmod a+x meta_learning_experiments_submission.py
# chmod a+x download_miniImagenet.py
#
# condor_submit -i
# condor_submit job.sub
#
####################

# Executable   = meta_learning_experiments_submission.py
# Executable = automl-proj/experiments/meta_learning/meta_learning_experiments_submission.py
# Executable = ~/automl-meta-learning/automl-proj/experiments/meta_learning/meta_learning_experiments_submission.py
Executable = /home/miranda9/automl-meta-learning/automl-proj/experiments/meta_learning/meta_learning_experiments_submission.py

## Output Files
Log          = condor_job.$(CLUSTER).log.out
Output       = condor_job.$(CLUSTER).stdout.out
Error        = condor_job.$(CLUSTER).err.out

# Use this to make sure 1 gpu is available. The key words are case insensitive.
REquest_gpus = 1
# requirements = ((CUDADeviceName = "Tesla K40m")) && (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.Cpus >= RequestCpus) && (TARGET.gpus >= Requestgpus) && ((TARGET.FileSystemDomain == MY.FileSystemDomain) || (TARGET.HasFileTransfer))
# requirements = (CUDADeviceName == "Tesla K40m")
# requirements = (CUDADeviceName == "Quadro RTX 6000")
requirements = (CUDADeviceName != "Tesla K40m")

# Note: to use multiple CPUs instead of the default (one CPU), use request_cpus as well
Request_cpus = 8

# E-mail option
Notify_user = me@gmail.com
Notification = always

Environment = MY_CONDOR_JOB_ID= $(CLUSTER)

# "Queue" means add the setup until this line to the queue (needs to be at the end of script).
Queue

The first few lines of my submission script, up to the line that fails:

#!/home/miranda9/.conda/bin/python3.7

import torch
import torch.nn as nn
import torch.optim as optim
# import torch.functional as F
from torch.utils.tensorboard import SummaryWriter


Related comments:

I did see the question "How to run a python program on Condor?" and this page, http://chtc.cs.wisc.edu/python-jobs.shtml, but I can't believe we have to do that. No one else on the cluster does anything that complicated, and I have run my scripts before without doing anything like it, so I am very skeptical that this is needed.

Recommended answer

HTCondor uses different default environments for interactive and batch jobs. An interactive job replicates the shell environment of your login session, including the activated conda environment. A batch job begins with a very pared-down environment (to see this in action, try running a test job with /usr/bin/env as the executable), so an activated conda environment is not carried forward into the batch job.
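To compare the two environments from Python rather than with /usr/bin/env, a small diagnostic script can be run once in an interactive job and once as a batch job. This is a minimal sketch: the function name `describe_environment` is made up for illustration, and the variables it reports (`CONDA_PREFIX`, `CONDA_DEFAULT_ENV`, `PATH`) are the ones conda normally sets on activation, not something the original answer lists. To use it as a condor executable it would also need a shebang line, such as the one in the question.

```python
import os
import sys


def describe_environment(env=None):
    """Return lines describing the interpreter and conda-related variables.

    `env` defaults to the real process environment; passing a dict
    makes the function easy to check in isolation.
    """
    env = os.environ if env is None else env
    lines = [f"interpreter: {sys.executable}"]
    # Variables that conda normally sets when an environment is activated.
    for name in ("CONDA_PREFIX", "CONDA_DEFAULT_ENV", "PATH"):
        lines.append(f"{name}: {env.get(name, '<not set>')}")
    return lines


if __name__ == "__main__":
    # The output lands in the file named by Output in the submit file.
    print("\n".join(describe_environment()))
```

If the batch run reports the conda variables as not set while the interactive run shows them, that matches the behavior described above.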

This behavior and potential submit file solutions are described here in the HTCondor manual: https://htcondor.readthedocs.io/en/latest/users-manual/services-for-jobs.html?highlight=environment#environment-variables
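One submit-file approach is to pass the relevant variables explicitly through the `environment` command. The sketch below, run from the shell where the conda environment is active, prints an `environment = "..."` line in HTCondor's double-quoted ("new") syntax that could be pasted into the submit file. The helper names `condor_quote` and `environment_line`, and the choice of which variables to forward, are assumptions made for illustration. Whether forwarding these variables is enough on a given cluster is also an assumption. HTCondor has a `getenv` submit command that copies the submitting shell's environment as well, though some sites restrict it.

```python
import os


def condor_quote(value):
    """Quote one value for the double-quoted `environment` syntax."""
    # Literal double quotes are escaped by doubling them.
    value = value.replace('"', '""')
    # Whitespace or single quotes require a single-quoted section,
    # inside which literal single quotes are doubled.
    if any(ch in value for ch in " \t'"):
        value = "'" + value.replace("'", "''") + "'"
    return value


def environment_line(env, names):
    """Build an `environment = "..."` submit-file line from selected names."""
    entries = [f"{name}={condor_quote(env[name])}" for name in names if name in env]
    return 'environment = "' + " ".join(entries) + '"'


if __name__ == "__main__":
    # Hypothetical selection: the variables conda changes on activation.
    print(environment_line(os.environ, ["PATH", "CONDA_PREFIX", "CONDA_DEFAULT_ENV"]))
```

A submit file accepts only one `environment` command, so the printed line would have to be merged with the existing `Environment = MY_CONDOR_JOB_ID= $(CLUSTER)` entry rather than added next to it. The sketch does not handle values containing `$(`, which the submit file would treat as a macro reference.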
