How to validate one CSV file's data against another CSV file using Pentaho?


Problem description

I have two CSV files. One file has 10 rows, and the other contains a list of data. I want to take the data in one field of the first CSV file and compare it with the other CSV file. How can I achieve this? Any help would be great.

Solution

The step you are looking for is the Stream Lookup step.

Read your CSV file and the reference file, connect the two flows to a Stream Lookup step, and set it up as follows:
a) Lookup step = the step that reads the reference file.
b) Keys / Field = the name of the field in the CSV file that identifies the matching row in the reference file.
c) Keys / Lookup field = the name of the corresponding field in the reference file.
d) Field to retrieve = the name of the field in the reference file to return (this may be the identifier or any other field you need).
e) Field to retrieve / Type = the type of the returned field. Do not forget to set it!

This adds a column from the reference file to each of the 10 rows of the CSV file. You can then filter out the rows that the lookup did not find by testing whether the value of the new column is null, for example with a Filter rows step.
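To make the logic of the lookup and the null test concrete, here is a minimal Python sketch of the same idea outside PDI. It is not how Pentaho implements the step. The function names (`stream_lookup`, `split_by_match`), the field names and the sample data are all hypothetical and chosen only for illustration.

```python
import csv
import io


def stream_lookup(main_rows, reference_rows, key_field, lookup_field,
                  retrieve_field, new_field):
    """Add new_field to every main row; the value is None when no match exists."""
    # Index the reference rows by their lookup key.
    # Assumption of this sketch: if a key appears twice, the last row wins.
    index = {row[lookup_field]: row[retrieve_field] for row in reference_rows}
    result = []
    for row in main_rows:
        enriched = dict(row)  # copy so the input rows are left untouched
        enriched[new_field] = index.get(row[key_field])  # None if not found
        result.append(enriched)
    return result


def split_by_match(rows, new_field):
    """Separate rows the lookup matched from rows it did not (the null test)."""
    matched = [r for r in rows if r[new_field] is not None]
    unmatched = [r for r in rows if r[new_field] is None]
    return matched, unmatched


# Hypothetical sample data standing in for the two CSV files.
MAIN_CSV = "id,name\n1,alice\n2,bob\n3,carol\n"
REFERENCE_CSV = "ref_id,country\n1,FR\n3,DE\n"

main_rows = list(csv.DictReader(io.StringIO(MAIN_CSV)))
reference_rows = list(csv.DictReader(io.StringIO(REFERENCE_CSV)))

looked_up = stream_lookup(main_rows, reference_rows,
                          key_field="id", lookup_field="ref_id",
                          retrieve_field="country", new_field="country")
matched, unmatched = split_by_match(looked_up, "country")

print([r["id"] for r in matched])    # ids found in the reference data
print([r["id"] for r in unmatched])  # ids missing from the reference data
```

With real files, the two `io.StringIO(...)` sources would be replaced by opened file handles. The point of the sketch is only that every input row survives the lookup, and the null value in the new column is what marks a row as not validated.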

Since all of the settings above are guided by drop-down lists in PDI, this should take you about two minutes.
