Parent-child relationship in Talend

Problem description

I'm facing a problem and am out of ideas on how to implement a parent-child relationship in Talend.

Problem statement:

I have a feed file with data in the following format:

MemberCode|LastName|FirstName
A|SHINE|MICHAEL 
B|SHINE|MICHELLE 
C|SHINE|ERIN 
A|RODRIGUEZ|DAMIAN 
A|PAVELSKY|STEPHEN        
B|PAVELSKY|TERESA

(There are many more columns and rows; these few are just for reference.) LastName and FirstName are self-explanatory. MemberCode denotes the relationship: A is the parent, and B or C is a child. The data for a given employee is always sequential, meaning the complete parent-child data sits in consecutive rows.

Expected result:

The above data needs to be output in the following format:

MemberCode|MemberLastName|MemberFirstName|DependentLastName|DependentFirstName
A         |SHINE         |MICHAEL        |                 |                  
B         |SHINE         |MICHAEL        |SHINE            |MICHELLE          
C         |SHINE         |MICHAEL        |SHINE            |ERIN              
A         |RODRIGUEZ     |DAMIAN         |                 |                  
A         |PAVELSKY      |STEPHEN        |                 |                  
B         |PAVELSKY      |STEPHEN        |PAVELSKY         |TERESA            

What I have tried so far:

The Talend job has these components: tFileInputDelimited -> tMap -> tLogRow. With my current tMap logic, I get the following output:

MemberCode|MemberLastName|MemberFirstName|DependentLastName|DependentFirstName
A         |SHINE         |MICHAEL        |                 |                  
B         |              |               |SHINE            |MICHELLE          
C         |              |               |SHINE            |ERIN              
A         |RODRIGUEZ     |DAMIAN         |                 |                  
A         |PAVELSKY      |STEPHEN        |                 |                  
B         |              |               |PAVELSKY         |TERESA

How can I replicate the MemberFirstName and MemberLastName values from the MemberCode A row onto the rows that have MemberCode B or C? Thanks in advance.

Platform: Talend Open Studio for Data Integration, version 6.5.1

Recommended answer

Here's the solution I put together:

You need to split your rows into parents and children based on their MemberCode. Write the parents to the file with DependentLastName and DependentFirstName left empty, and at the same time save the parent info to global variables (ParentLastName and ParentFirstName) in a tSetGlobalVar.

When you move to the next row, which is a child row, the parent has already been saved, since it is always the first row in its group. So in the children output you can retrieve the parent's first and last name from the global variables and write the row to the same physical file.
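Outside the Talend designer, the same carry-forward logic can be sketched in plain Java. This is only an illustration of the idea, not the code Talend generates: the class name `ParentChildFlatten`, the method `flatten`, and the local `globalMap` field are assumptions made for this example, with the `HashMap` standing in for the job's `globalMap`. It also assumes, as the answer does, that every child row is preceded by its parent row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParentChildFlatten {

    // Stand-in for the job-wide globalMap (hypothetical, for illustration only).
    static final Map<String, Object> globalMap = new HashMap<>();

    /** Turns "MemberCode|LastName|FirstName" rows into the five-column output. */
    static List<String> flatten(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String[] f = row.split("\\|", -1);
            String code = f[0].trim();
            String last = f[1].trim();
            String first = f[2].trim();

            if ("A".equals(code)) {
                // Parent branch: remember the parent, leave dependent columns empty.
                globalMap.put("ParentLastName", last);
                globalMap.put("ParentFirstName", first);
                out.add(String.join("|", code, last, first, "", ""));
            } else {
                // Child branch: read back the parent saved by the preceding "A" row.
                String parentLast = (String) globalMap.get("ParentLastName");
                String parentFirst = (String) globalMap.get("ParentFirstName");
                out.add(String.join("|", code, parentLast, parentFirst, last, first));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> feed = Arrays.asList(
                "A|SHINE|MICHAEL ",
                "B|SHINE|MICHELLE ",
                "C|SHINE|ERIN ",
                "A|RODRIGUEZ|DAMIAN ",
                "A|PAVELSKY|STEPHEN        ",
                "B|PAVELSKY|TERESA");
        flatten(feed).forEach(System.out::println);
    }
}
```

In the Talend job itself, the child branch would read the values with expressions such as `(String) globalMap.get("ParentLastName")`, using the key names set in the tSetGlobalVar.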

Both tFileOutputDelimited components have identical settings: they are in append mode and have the option "Custom the flush buffer size" set to 1. This is important for keeping the rows in the right order.
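Why the flush setting matters can be shown with a small Java analogy. This is a sketch, not how tFileOutputDelimited is implemented: the class `AppendOrderDemo` and its methods are hypothetical, and two `BufferedWriter` instances in append mode stand in for the two output components. When each writer is flushed after every row, the rows reach the file in the order they were produced; without that, each writer holds its rows in its own buffer and the file order depends on when those buffers are emptied.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class AppendOrderDemo {

    /** Writes one row and, if requested, flushes it to disk immediately. */
    static void write(BufferedWriter w, String row, boolean flushEachRow) throws IOException {
        w.write(row);
        w.newLine();
        if (flushEachRow) {
            w.flush(); // plays the role of a flush buffer size of 1
        }
    }

    /** Sends rows through two append-mode writers that share one file. */
    static List<String> writeInterleaved(Path file, boolean flushEachRow) throws IOException {
        Files.deleteIfExists(file);
        try (BufferedWriter parents = new BufferedWriter(new FileWriter(file.toFile(), true));
             BufferedWriter children = new BufferedWriter(new FileWriter(file.toFile(), true))) {
            write(parents, "A|SHINE|MICHAEL||", flushEachRow);
            write(children, "B|SHINE|MICHAEL|SHINE|MICHELLE", flushEachRow);
            write(children, "C|SHINE|MICHAEL|SHINE|ERIN", flushEachRow);
            write(parents, "A|RODRIGUEZ|DAMIAN||", flushEachRow);
        }
        return Files.readAllLines(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("members", ".txt");
        System.out.println("flush per row: " + writeInterleaved(file, true));
        System.out.println("buffered:      " + writeInterleaved(file, false));
        Files.deleteIfExists(file);
    }
}
```

With `flushEachRow` set to `true`, the file keeps the parent row directly ahead of its children, which is the ordering the expected output requires.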
