Azure Data Factory Pipeline 'On Failure'

Question

I am setting up an ADF pipeline to copy blobs into an Azure SQL database. The pipeline contains an iteration activity with a counter, so that it loops and copies only if the blob exists.

This works well except for occasional primary key (PK) violations, which I will have to check manually. So I edited the pipeline to log the error and continue: if the copy activity fails due to a primary key violation, ignore the failure (for now), log the details using a stored procedure, and continue as usual, i.e. update the loop counter to move on to the next folder.

Unfortunately, the success of the Log Failure activity does not trigger the "Set Variable" activity. The loop counter is therefore never updated, and the pipeline runs in an infinite loop that keeps hitting the same exception, even though the Stored Procedure activity itself logs the error message correctly.

If I create a new "Set Variable" activity that does exactly what SetLoopVariable does, it seems to work. But that means I have to duplicate every activity after that point to maintain two separate paths, which feels redundant.

BACKGROUND: My file structure is container/YYYY/MM/dd/HH/mm. There will be at least one file per hour, but not one for every minute of the day, so I have to check whether the folder exists before attempting to copy.
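For illustration, the per-minute folder check described above could be sketched in Python as follows. This is only a rough model of the loop logic, not ADF code: the function names `minute_folders` and `folders_to_copy` and the `existing` set (standing in for a real "does this folder exist" lookup against blob storage) are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Iterator, List, Set


def minute_folders(start: datetime, end: datetime) -> Iterator[str]:
    """Yield one YYYY/MM/dd/HH/mm folder path per minute in [start, end)."""
    current = start
    while current < end:
        # %M is the minute; %m is the month.
        yield current.strftime("%Y/%m/%d/%H/%M")
        current += timedelta(minutes=1)


def folders_to_copy(start: datetime, end: datetime, existing: Set[str]) -> List[str]:
    """Keep only the candidate folders that actually exist.

    `existing` is a hypothetical stand-in for an existence check
    against the container.
    """
    return [folder for folder in minute_folders(start, end) if folder in existing]


if __name__ == "__main__":
    # Assumed sample data: only one of the three minutes has a folder.
    existing_folders = {"2019/01/01/00/59"}
    window_start = datetime(2019, 1, 1, 0, 58)
    window_end = datetime(2019, 1, 1, 1, 1)
    for folder in folders_to_copy(window_start, window_end, existing_folders):
        print("would copy from", folder)
```

The point of the sketch is that the counter must advance on every iteration, whether or not a copy happened, otherwise the loop never reaches the next folder.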

Recommended answer

Thomas's answer is correct. I had this exact issue recently. In case it helps someone else: I realised that the arrows don't represent a flow but a dependency. A box only runs once all of the activities it depends on have finished with the required outcome, which is impossible in your case because it would require the copy to both succeed and fail.

To solve your case, just duplicate the 'set loop variable' activity in your error-handling path.
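The dependency behaviour described in this answer can be modelled with a small Python sketch. It is a toy model, not ADF's actual engine: it assumes that an activity runs only when every one of its dependencies is satisfied and is otherwise skipped. The activity names `CopyBlob` and `SetLoopVariableOnError` are made up for the example; `LogFailure` and `SetLoopVariable` follow the question.

```python
from typing import Dict, List, Tuple

# (upstream activity name, outcomes of that activity that satisfy the dependency)
Dependency = Tuple[str, List[str]]


def run_pipeline(activities: Dict[str, List[Dependency]],
                 forced_outcomes: Dict[str, str]) -> Dict[str, str]:
    """Return the final status of every activity.

    Activities must be listed in dependency order. An activity runs only if
    ALL of its dependencies are satisfied; otherwise it is marked "Skipped".
    """
    status: Dict[str, str] = {}
    for name, depends_on in activities.items():
        ready = all(status[upstream] in accepted for upstream, accepted in depends_on)
        if ready:
            status[name] = forced_outcomes.get(name, "Succeeded")
        else:
            status[name] = "Skipped"
    return status


# Design from the question: one shared SetLoopVariable fed by both paths.
shared_design: Dict[str, List[Dependency]] = {
    "CopyBlob": [],
    "LogFailure": [("CopyBlob", ["Failed"])],
    "SetLoopVariable": [("CopyBlob", ["Succeeded"]), ("LogFailure", ["Succeeded"])],
}

# Design from the answer: a second Set Variable on the error-handling path.
duplicated_design: Dict[str, List[Dependency]] = {
    "CopyBlob": [],
    "LogFailure": [("CopyBlob", ["Failed"])],
    "SetLoopVariable": [("CopyBlob", ["Succeeded"])],
    "SetLoopVariableOnError": [("LogFailure", ["Succeeded"])],
}

if __name__ == "__main__":
    for label, design in [("shared", shared_design), ("duplicated", duplicated_design)]:
        for copy_outcome in ["Succeeded", "Failed"]:
            result = run_pipeline(design, {"CopyBlob": copy_outcome})
            print(label, "copy", copy_outcome, "->", result)
```

Under these assumptions, the shared `SetLoopVariable` is skipped whether the copy succeeds or fails, while in the duplicated design exactly one of the two Set Variable activities runs in each case, so the loop counter always advances.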

However, you might then run into the same problem that I now have, described in "Azure data factory: Handling inner failure in until/for activity".
