Remove duplicates in SSIS Data Flow

Question

I am working on an SSIS data flow task.

The source table comes from an old database and is denormalized.

The destination table is normalized.

SSIS fails because the data cannot be transferred: there are duplicates in the primary key column.

It would be good if SSIS could check the destination for the current record (by checking the key) and, if it already exists, skip it and continue with the next record.
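
To make the desired behavior concrete, here is a minimal Python sketch of it. It is not SSIS code: the `push_new_rows` function, the `id` key column, and the dict standing in for the destination table are all assumptions made for illustration.

```python
def push_new_rows(source_rows, destination, key="id"):
    """Insert each source row whose key is not yet in the destination.

    `destination` is a dict keyed by primary key; it stands in for the
    destination table. Returns the rows skipped as duplicates.
    """
    skipped = []
    for row in source_rows:
        if row[key] in destination:
            skipped.append(row)       # key already present: ignore and move on
            continue
        destination[row[key]] = row   # new key: push the row
    return skipped


# Hypothetical sample data with one duplicate primary key.
source = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 1, "name": "Ann"},
]
destination = {}
skipped = push_new_rows(source, destination)
print(sorted(destination))  # [1, 2]
print(len(skipped))         # 1
```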

Is there a way to handle this scenario?

Recommended answer

Assuming your destination table is a subset of your source table, you should be able to use the Sort Transformation to pull in only the columns you need for your destination table, and then check the "Remove rows with duplicate sort values" option. This gives you a distinct list of records based on the columns you selected.

Then, simply route the results of the sort to your destination, and you should be good to go.
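
As a rough illustration of what that sort-and-deduplicate step does to the data, here is a small Python sketch. It is not SSIS code, and it simplifies the transformation by keeping only the sort columns; the function name, column names, and sample rows are hypothetical.

```python
def sort_remove_duplicates(rows, sort_columns):
    """Sort rows by sort_columns and keep one row per distinct sort key.

    Only the sort columns are kept in the output, mirroring the idea of
    pulling in just the columns the destination table needs.
    """
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in sort_columns))
    result = []
    previous_key = object()  # sentinel that never equals a real key
    for row in ordered:
        key = tuple(row[c] for c in sort_columns)
        if key != previous_key:  # first row with this sort key
            result.append({c: row[c] for c in sort_columns})
            previous_key = key
    return result


# Hypothetical denormalized source: customer data repeated on every order.
source = [
    {"customer_id": 2, "customer_name": "Bob", "order_id": 10},
    {"customer_id": 1, "customer_name": "Ann", "order_id": 11},
    {"customer_id": 2, "customer_name": "Bob", "order_id": 12},
]
customers = sort_remove_duplicates(source, ["customer_id", "customer_name"])
for customer in customers:
    print(customer)
```

Each distinct combination of the selected columns appears once in the output, so the rows can be loaded into a destination table whose primary key is among those columns, provided the other selected columns do not vary for the same key.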
