How can we read only new files?


Question

I am trying to use ADF to read from different sources (SFTP and AWS S3 in this case) and load the files into ADLS.

This works on the first run, but on the second run it tries to load all the files from the sources again.

I only need to load the new or updated files. How can I achieve this using ADF?

I can currently do this with other tools (for example NiFi), because they keep state, whereas ADF does not appear to keep state and reads all the files on every subsequent run. This is a common use case, and I hope others are already doing it.

Please help.

Regards,

Sai

Recommended answer

Hi Saikrishna,

There is a feature in the ADF Copy activity that filters source files by their last modified time, and it should be available in Prod in one or two weeks. Could you check whether this feature would meet your requirement?
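For illustration only, here is a rough Python sketch of the general idea: keep a "watermark" (the latest last modified time already processed) between runs and pick up only files modified after it. This is not ADF's API or how the Copy activity is implemented. The function names, the JSON state file, and the use of a local directory in place of SFTP or S3 are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import List


def load_watermark(state_file: Path) -> float:
    """Return the stored watermark (epoch seconds), or 0.0 on the first run."""
    if state_file.exists():
        return json.loads(state_file.read_text())["last_modified"]
    return 0.0


def save_watermark(state_file: Path, value: float) -> None:
    """Persist the watermark so the next run can resume from it."""
    state_file.write_text(json.dumps({"last_modified": value}))


def select_new_files(source_dir: Path, watermark: float) -> List[Path]:
    """Return files modified strictly after the watermark, sorted by path."""
    return sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.stat().st_mtime > watermark
    )


def run_once(source_dir: Path, state_file: Path) -> List[Path]:
    """One hypothetical pipeline run: select new files, then advance the watermark."""
    watermark = load_watermark(state_file)
    new_files = select_new_files(source_dir, watermark)
    if new_files:
        # The actual copy to the destination would happen here (omitted).
        save_watermark(state_file, max(p.stat().st_mtime for p in new_files))
    return new_files
```

Calling `run_once` repeatedly returns only files added or modified since the previous call. The state file should live outside the source directory so it is not picked up as input. Because the comparison is strict, a file that arrives later with exactly the same timestamp as the watermark would be skipped, so a real setup may need a tie-breaking rule.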

