如何克服Netezza缺乏独特的约束/参照完整性执行? [英] How to overcome Netezza's lack of unique constraint/referential integrity enforcement?

查看:256
本文介绍了如何克服Netezza缺乏独特的约束/参照完整性执行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

似乎缺乏对强制执行2个基本约束(唯一和外键)的支持,这是丢失许多工时调试和解决困难问题的原因。在我们的应用程序中,甚至在我们的硬件中,什么开始作为一个微不足道,易于解决的问题(双行/不一致的主要细节表)未被注意到,增长和提出复杂的边缘场景(例如自连接与dups

It appears that the lack of support for enforcing 2 basic constraints (unique and foreign key), is a reason for loss of many man-hours debugging and troubleshooting difficult problems. What starts as a trivial, easy-to-fix problem (dup rows /inconsistent primary-detail tables) goes un-noticed, grows and raises complex edge-scenarios in our application, or even in our hardware (e.g. self-join with dups can cause inflation and storage exhaustion).


  • Netezza在我们的环境中提供多种用途:生产,研究,qa和分期。自然,我们的ETL过程不能成熟,无法验证所有这些情况下的所有约束。

  • 即使在我们生产中使用的最成熟的应用程序中, ,我们创建一系列表,每个表的计算结果对其前身。有时,数据完整性在一开始就被破坏,而不是在开始时(作为错误语句的结果)

推荐一种避免这些头痛的方法/工具?

Can anyone recommend a methodology/tool for avoiding these headaches?

推荐答案

我们最初编写了一个存储过程来处理这个数据仓库中的事情。

工作在理论上,它是一种缓慢的通常在NZ上找到的巨大的表(〜10分钟为500M记录表)。

We originally wrote a stored procedure to handle this very thing in our data warehouse.
While it works in theory, it is kind of slow for huge tables commonly found on NZ (~10 mins for a 500M record table).

我将在下面解释存储过程,但是现在我想说,我们现在甚至没有利用proc,因为我们的增量/ upsert加载显式地仅插入目标数据库中不存在的记录。 (对于upsert,我们基本上只是删除存在于我们的记录集中的任何记录,在插入之前插入)。

I'll explain the stored proc below, but for now I wanted to say that we're not even leveraging the proc at the moment because our incremental/upsert loads are explicitly only inserting records that do not already exist in the target database. (For upsert, we're basically just deleting any records that exist in our set of records to insert before inserting).

这有它自己的问题,特别是因为NZ doesn像删除一样,需要不断修改表以实际回收空间,更不用说通过删除已经更改的旧记录,我们的upser可能会丢失历史列数据(我们有其他进程来加载缓慢变化的维度)

This has it's own problems, particularly because NZ doesn't like deletes and needs to have the tables constantly groomed to actually reclaim the space, not to mention the fact that our upserts might lose historical column data by deleting old records that had since been changed (we have other processes for loading slowly changing dimensions that we want to track.)

无论如何,约束存储过程如下:

Anyway, the constraint stored proc looks like:

check_constraints('table','constraint_type') returns boolean

  select constraintname, contype, attname
    from _v_relation_keydata
   where relation = upper(p_table) order by constraintname, conseq;

获取应比较的列。对于每个约束,它然后运行像:

To get the columns that should be compared. For each constraint, it then runs something like:

  select count(*) 
    from ( 
    select distinct col1,col2,...
      from p_table 
     where col1 is not null 
       and col2 is not null... );

并将此数字与

  select count(*) 
    from p_table 
   where col1 is not null 
     and col2 is not null...;

如果不同,我们会引发异常。

If they're different, we raise an exception.

这篇关于如何克服Netezza缺乏独特的约束/参照完整性执行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆