Datastage Sequence job: how to process each file at a time if those files are in 7 different folders


Question

DataStage: there are 7 folders under one path, and each folder contains 2 files. For example, the two files are named in the following format: test_s1_YYYYMMDD.txt and test_s1_YYYYMMDD.done. The paths for these files are user/test/test_s1/, user/test/test_s2/, ..., user/test/test_s7/, where s1, s2, ... s7 denote the different folders.

Both of the files mentioned above are present in each of these folders, so how can I process each file in a sequence job?

Recommended answer

How you process the files is an important decision, and it depends on what you want to do with them. Starting one or more jobs in a sequence for each file can lead to heavy overhead just for starting the jobs. Try loading all files at once in a parallel job using the Sequential File stage instead.

In the Sequential File stage, set the appropriate Format. You can also set everything to none so that each row is put into a single column, and then process that column in a later job. This makes the reading very flexible and forgiving. If your files all have the same structure, define your columns as needed.

To select the files, use File Patterns. In the Options of the Sequential File stage, choose to have a File Name Column so you can process the file names in a later job. You might also want to add a Row Number Column.
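Outside DataStage, the same idea (one wildcard pattern that picks up every data file, with the file name and row number carried along as extra columns) can be sketched in Python. This is only an illustration, not DataStage configuration: the function names, the date 20240101, and the temporary demo directory are hypothetical, and the pattern here is expanded by Python's glob module rather than by the Sequential File stage.

```python
import glob
import os
import tempfile


def read_matching_files(pattern):
    """Yield (file_name, row_number, line) for each line of each file matching the pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            # Row numbers restart at 1 for every file.
            for row_number, line in enumerate(handle, start=1):
                # Each row is kept as one unparsed string, like a single-column read.
                yield path, row_number, line.rstrip("\n")


def build_demo_tree(root):
    """Create a hypothetical layout: test_s1 .. test_s7, each with a .txt and a .done file."""
    for i in range(1, 8):
        folder = os.path.join(root, f"test_s{i}")
        os.makedirs(folder)
        data_file = os.path.join(folder, f"test_s{i}_20240101.txt")
        with open(data_file, "w", encoding="utf-8") as handle:
            handle.write(f"row A from s{i}\nrow B from s{i}\n")
        # Empty marker file next to the data file.
        open(os.path.join(folder, f"test_s{i}_20240101.done"), "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_demo_tree(root)
        # One pattern covers all seven folders and skips the .done files.
        pattern = os.path.join(root, "test_s*", "test_s*_*.txt")
        for file_name, row_number, line in read_matching_files(pattern):
            print(os.path.basename(file_name), row_number, line)
```

Because the pattern ends in .txt, the .done files are never read, and each output row keeps its source file name, so a later step can still tell the seven folders apart.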

This method works pretty fast.
