KNIME comparing datasets

Problem description

We use KNIME to run our workflows. In them, we compare customer data held in two databases, one Oracle and one Hive, and we want to report how much of the data matches and how much does not. We now want to group customer IDs by the customers' locations to see which locations produce the most mismatches. Which nodes should I use to build this kind of customized report?

Recommended answer

It is not very clear how you want to do the comparison, but I think you will need the Joiner node. After that you can use the GroupBy node to compute the mismatches by location. Before the GroupBy, though, you should use, for example, a Rule Engine node to convert the missing values created by the Joiner to one value and all other values to a different value. (If the original datasets have missing values in the columns of interest, you should replace them before the Joiner.)
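The logic behind that node chain can be sketched outside KNIME in plain Python. This is only an illustration of the join, flag, and count steps, not KNIME code: the column names `customer_id`, `location`, and `name`, the function `compare_by_location`, and the sample rows are all assumptions made up for the example. It also assumes a full outer join on the customer ID and treats a row as a mismatch when it is missing on one side or when a compared column differs.

```python
from collections import Counter


def compare_by_location(oracle_rows, hive_rows, key="customer_id",
                        location="location", compare=("name",)):
    """Return {location: Counter of 'match' / 'mismatch'} for two row lists."""
    oracle = {row[key]: row for row in oracle_rows}
    hive = {row[key]: row for row in hive_rows}
    report = {}
    # Joiner step: full outer join on the key column.
    for cid in sorted(oracle.keys() | hive.keys()):
        left, right = oracle.get(cid), hive.get(cid)
        # Rule Engine step: collapse each joined row into a single flag.
        if left is None or right is None:
            status = "mismatch"  # row exists in only one database
        elif all(left[col] == right[col] for col in compare):
            status = "match"
        else:
            status = "mismatch"  # row exists in both, but values differ
        # GroupBy step: count flags per location
        # (location is read from whichever side has the row, Oracle first).
        loc = (left or right)[location]
        report.setdefault(loc, Counter())[status] += 1
    return report


# Hypothetical sample data standing in for the two database tables.
oracle_rows = [
    {"customer_id": 1, "location": "Berlin", "name": "Ann"},
    {"customer_id": 2, "location": "Berlin", "name": "Bob"},
    {"customer_id": 3, "location": "Paris", "name": "Cy"},
]
hive_rows = [
    {"customer_id": 1, "location": "Berlin", "name": "Ann"},
    {"customer_id": 2, "location": "Berlin", "name": "Rob"},
    {"customer_id": 4, "location": "Paris", "name": "Di"},
]

report = compare_by_location(oracle_rows, hive_rows)
for loc, counts in sorted(report.items()):
    print(loc, "matched:", counts["match"], "mismatched:", counts["mismatch"])
```

With this sample data, Berlin ends up with one match and one mismatch, and Paris with two mismatches, since customers 3 and 4 each exist in only one source. In a KNIME workflow the flag column would come from the Rule Engine node and the per-location counts from the GroupBy node.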
