Azure Batch task dependencies: copy files from previous task


Question

I have an Azure Batch scenario with a chain of Tasks that run one after another. The dependencies are set correctly, so the Tasks run in the right order.

However, before each Task executes I need to copy all files from the previous Task's folder to the new Task's folder. I do not know in advance how many files there will be or what they are, so I just want to copy everything. I could not find a way to do this with the Batch client library (https://docs.microsoft.com/en-us/dotnet/api/overview/azure/batch?view=azure-dotnet).

As a workaround, I added a simple copy step to the .bat file that the Task's command line executes, but for some reason it copies only some of the files. One Task has a few hundred files to copy, and the portion that gets copied before the copying stops (with no errors) varies by a few percent from run to run. This is my copy command: $"cmd /c xcopy /E /F /Y %AZ_BATCH_TASK_WORKING_DIR%\\..\\..\\{previousTaskId}\\wd %AZ_BATCH_TASK_WORKING_DIR%". Everything works correctly if I run the same steps directly on the VM.
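For illustration only, the same copy step could be written in Python instead of xcopy. This is a sketch and not part of the original post: it assumes Python 3.8+ is available on the node, and the helper names `previous_task_wd` and `copy_previous_output` are hypothetical. The relative path mirrors the one in the xcopy command above. Like xcopy, it copies only what exists in the source folder at the moment it runs.

```python
import shutil
from pathlib import Path


def previous_task_wd(current_wd: Path, previous_task_id: str) -> Path:
    """Resolve <current wd>\\..\\..\\<previousTaskId>\\wd, as in the xcopy command."""
    return current_wd.parent.parent / previous_task_id / "wd"


def copy_previous_output(current_wd: Path, previous_task_id: str) -> int:
    """Copy the previous task's working directory into the current one.

    Returns the number of files found in the source directory, which is
    handy for logging how much was actually there to copy.
    """
    source = previous_task_wd(current_wd, previous_task_id)
    # dirs_exist_ok=True (Python 3.8+) merges into the existing directory and
    # overwrites files with the same name, roughly like xcopy /E /Y.
    shutil.copytree(source, current_wd, dirs_exist_ok=True)
    return sum(1 for path in source.rglob("*") if path.is_file())
```

A task could call it as `copy_previous_output(Path(os.environ["AZ_BATCH_TASK_WORKING_DIR"]), previous_task_id)` after `import os`, where `previous_task_id` is whatever ID the preceding Task was given.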

Hypotheses tested:

  • The copy overwrites the .bat file that runs the actual processing, which in turn breaks the copy. I have now ruled this out (each task has a differently named .bat file).
  • The copying is for some reason done in parallel. I added timestamp echoes to the .bat files and saw no parallelism, so this can't be the reason. I also tried adding sleep 10 before the xcopy, but it made no difference.
  • xcopy doesn't see all the files for some reason. I added a dir command to list the files, and it shows only the same files that xcopy copies.
  • User access issues. This doesn't make sense, as some files are copied successfully and there are no errors.

Any ideas? This sounds like a trivial scenario, but I just couldn't figure out how to do it.

Recommended answer

It turned out that the problem was in the previous Task: it launched a process that started generating the files in the background and returned control immediately. The Batch engine therefore considered the Task finished and moved on to the next Task, which began by copying the files the previous Task was still generating.

My hypothesis about parallelism was therefore partially correct, although it wasn't visible in the echoed timestamps (the first Task reported finishing before the second Task reported starting). The sleep experiment would have revealed the problem, but I either used too short a delay or somehow misread the results.

Because I can't control how the first Task launches the process, I added a Windows Batch script that polls tasklist until the process has ended, and that solved the problem.
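A rough sketch of that polling idea, written in Python rather than the Windows Batch script mentioned above (the original script is not shown). It assumes Python is installed on the node. The process name `generator.exe`, the helper names, and the default intervals are hypothetical. It runs `tasklist /FO CSV /NH` and waits until the named image no longer appears in the output.

```python
import csv
import io
import subprocess
import time
from typing import Callable, List


def parse_tasklist_csv(output: str) -> List[str]:
    """Extract image names from `tasklist /FO CSV /NH` output."""
    names = []
    for row in csv.reader(io.StringIO(output, newline="")):
        # Process rows look like: "name","PID","session","session#","mem usage".
        # Anything without a numeric PID column (e.g. an INFO line) is skipped.
        if len(row) >= 2 and row[1].strip().isdigit():
            names.append(row[0])
    return names


def list_processes() -> List[str]:
    """Return the image names of all running processes (Windows only)."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout)


def wait_for_exit(
    image_name: str,
    lister: Callable[[], List[str]] = list_processes,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until no process named image_name is running.

    Returns True once the process is gone, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_seconds
    target = image_name.lower()  # Windows image names are case-insensitive
    while True:
        running = [name.lower() for name in lister()]
        if target not in running:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

For example, calling `wait_for_exit("generator.exe")` at the end of the first Task's script would keep that Task from being reported as complete until the background process is gone. A check like this only works if the process is already visible when polling starts. If it starts late, the first poll may find nothing and return at once.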
