数据完整性SQL查询可修复表中的错误 [英] Data Integrity SQL query to fix the bug in the table

查看:76
本文介绍了数据完整性SQL查询可修复表中的错误的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当前,我在维度表中的其中一个表上面临数据完整性问题,下面是详细信息

Currently i am facing a data integrity issue on one of the table from dimension table ,below are the details

create table TEST_TBL
(
JOB_ID  NUMBER(38,0),
LOC_SID NUMBER(38,0),
CITY VARCHAR2(180 BYTE),
CITY_NM_CLR VARCHAR2(50 BYTE),
CITY_NM_OFFICIAL VARCHAR2(180 BYTE),
DISTRICT VARCHAR2(120 BYTE),
DISTRICT_CD VARCHAR2(20 BYTE),
IN_CNTRY_REG VARCHAR2(100 BYTE),
CNTRY_NM VARCHAR2(255 BYTE),
POSTAL_CD VARCHAR2(15 BYTE),
POPUL_APROX VARCHAR2(15 BYTE),
LONGITUDE NUMBER,
LATITUDE NUMBER,
GLOBAL_REG VARCHAR2(40 BYTE),
CNTRY_CD_2CHAR  VARCHAR2(2 BYTE),
CNTRY_CD_3CHAR  VARCHAR2(3 BYTE),
CNTRY_CD_NUMER  VARCHAR2(3 BYTE),
CNTRY_NM_OFFICIAL VARCHAR2(255 BYTE),
CNTRY_CALL_CD VARCHAR2(10 BYTE),
BANK_CNTRY_CD NUMBER(38,0),
CONTINENT VARCHAR2(30 BYTE),
MONETARY_REG VARCHAR2(40 BYTE),
EFFECTIVE_DT_FROM DATE,
EFFECTIVE_DT_TO DATE,
VERSION NUMBER(38,0),
UPDATED_BY_ETL_JOB NUMBER(38,0)
);

Insert into TEST_TBL values (1234,'81910','N/A',null,'N/A','N/A','0','N/A','USA','N/A','0','0','0','NA','EN','ENG','001',null,'91','0','North_America',null,to_date('01.01.00 00:00:00','DD.MM.YY HH24:MI:SS'),to_date('31.12.99 23:59:59','DD.MM.YY HH24:MI:SS'),'1',null);

请注意:

唯一索引已分配给LOC_SID(主键),以下列CITY_NM_CLR,CNTRY_CD_2CHAR的非唯一索引已就位

Unique index is already assigned to LOC_SID (primary key) non unique index is already inplace for the following columns CITY_NM_CLR,CNTRY_CD_2CHAR

当前情况

当我从表中检查完整性问题时,我得到了大约300多个提示(以上是同一数据集记录)

when i am checking the integrity issue from the table i am getting many recrods approx 300+ (above is the same data set record)

下面是我用来检查完整性的SQL查询

Below is the sql query i am using to check the integrity

SELECT count(*) AS RowAffected
FROM
  (SELECT LOC_SID,
          VERSION,
          EFFECTIVE_DT_FROM,
          EFFECTIVE_DT_TO,
          CITY,
          POSTAL_CD
   FROM
     (SELECT t.*,
             LEAD(EFFECTIVE_DT_FROM, 1) OVER(PARTITION BY CITY, POSTAL_CD
                                             ORDER BY EFFECTIVE_DT_FROM) AS next_date,
             LEAD(VERSION, 1) OVER(PARTITION BY CTY, POSTAL_CD
                                   ORDER BY EFFECTIVE_DT_FROM) AS next_version
      FROM TEST_TBL t)
   WHERE valid_to != next_date
     OR VERSION = next_version)

修复后,当我运行上述查询时,它不应返回任何结果

after the fix when i run the above query it should not return any result

让我概述一下带有样本数据的计数

Let me give a overview of counts with sample data

CITY       POSTAL_CD   COUNT(*)

N/A         N/A       502

TOKYO       N/A       30

HABSIGUDA   3452      2

DELHI       N/A       4

ASSAM       1153      1

ASSAM       1290      1

ASSAM       1310      1

ASSAM       1781      1

ASSAM       1982      1

样本数据

场景-1

LOC_SID CITY    POSTAL_CD   EFFECTIVE_DT_FROM  EFFECTIVE_DT_TO  VERSION
82141   N/A     N/A         01.01.1900         31.12.2199       1
82142   N/A     N/A         01.01.1900         31.12.2199       1

我有400这样的行

场景-2

LOC_SID CITY    POSTAL_CD   EFFECTIVE_DT_FROM   EFFECTIVE_DT_TO   VERSION
46211   TOKYO    N/A        01.01.1900          31.12.2199    1
46212   TOKYO    N/A        01.01.1900          31.12.2199    1

我有30个此类行

场景-3

 LOC_SID CITY       POSTAL_CD  EFFECTIVE_DT_FROM  EFFECTIVE_DT_TO  VERSION
 57372  HABSIGUDA   3452  01.01.1900          29.12.2017             1
 61321  HABSIGUDA   3452  29.12.2017          31.12.2199             2
 67371  UPPAL       4922  01.01.1900          31.12.2199             1
 80737  HABSIGUDA   3452  01.01.1900          31.12.2199             1

我只有一行

场景-4

LOC_SID CITY    POSTAL_CD   EFFECTIVE_DT_FROM   EFFECTIVE_DT_TO  VERSION
4822    DELHI   N/A         01.01.1900          31.12.2199     1
4825    DELHI   3911        01.01.1900          31.12.2199     1
4826    DELHI   N/A         01.01.1900          31.12.2199     1
4827    DELHI   N/A         01.01.1900          31.12.2199     1
4828    DELHI   N/A         01.01.1900          31.12.2199     1
4829    DELHI   N/A         01.01.1900          31.12.2199     1

我有4行

场景-5

LOC_SID CITY    POSTAL_CD   EFFECTIVE_DT_FROM   EFFECTIVE_DT_TO  VERSION
25101   ASSAM   1153            01.01.1900          31.12.2199     1
25102   ASSAM   1153            01.01.1900          31.12.2199     1
25103   ASSAM   1290            01.01.1900          31.12.2199     1
25104   ASSAM   1290            01.01.1900          31.12.2199     1
25105   ASSAM   1310            01.01.1900          31.12.2199     1
25106   ASSAM   1310            01.01.1900          31.12.2199     1
25107   ASSAM   1781            01.01.1900          31.12.2199     1
25108   ASSAM   1781            01.01.1900          31.12.2199     1
25109   ASSAM   1982            01.01.1900          31.12.2199     1
25110   ASSAM   1982            01.01.1900          31.12.2199     1

我有300行

基本上LOC_SID是PK并且是增量的,因此在预期的版本是增量的之后,下面是执行更新语句然后在db表中的预期结果

basically LOC_SID is PK and incremental ,so after the version expected is incremental below is the expected result after the update statement is executed then in the db table

**结果(运行更新语句后)**

** results (after running the update statement )**

场景-4

LOC_SID CITY    POSTAL_CD   EFFECTIVE_DT_FROM   EFFECTIVE_DT_TO  VERSION
4822    DELHI   N/A         01.01.1900          01.01.1900         1
4825    DELHI   3911        01.01.1900          31.12.2199         1
4826    DELHI   N/A         01.01.1900          01.01.1900         2
4827    DELHI   N/A         01.01.1900          01.01.1900         3
4828    DELHI   N/A         01.01.1900          01.01.1900         4
4829    DELHI   N/A         01.01.1900          01.01.1900         5

推荐答案

获取表TEST_TBLbackup并尝试以下操作

Take a backup of your table TEST_TBL and try the below

执行以下两个UPDATE直到更新zero rows

Do the following two UPDATE till you get zero rows updated

    BEGIN
    WHILE(1=1)
    LOOP 
    UPDATE TEST_TBL a 
    set version=(select prev_version+1 from   
                             (
                            SELECT
                        t.*,
                      ROWID,
                      Lag(version, 1)
                        over(
                          PARTITION BY cty, postal_cd
                          ORDER BY effective_dt_from) AS prev_version
                         from TEST_TBL T)t
                         where a.rowid=t.rowid 
                        AND a.version <=t.prev_version)
    WHERE EXISTS (select 1 from  (
                            SELECT
                        t.*,
                      ROWID,
                      Lag(version, 1)
                        over(
                          PARTITION BY cty, postal_cd
                          ORDER BY effective_dt_from) AS prev_version
                         from TEST_TBL T)t
                         where a.rowid=t.rowid 
                        AND a.version <=t.prev_version);
    commit;
    EXIT WHEN SQL%ROWCOUNT=0; 
    END LOOP;
    END;

第二次更新

    BEGIN
    WHILE(1=1)
    LOOP
    UPDATE TEST_TBL b
    set valid_to=(select next_date from 
                             (
                            SELECT
                        t.*,
                      ROWID,
                      LEAD(EFFECTIVE_DT_FROM, 1) OVER(PARTITION BY CITY, POSTAL_CD
                                                 ORDER BY EFFECTIVE_DT_FROM) AS next_date
                         from test_tbl T)t
                         where b.rowid=t.rowid 
                        AND valid_to != next_date)
    WHERE EXISTS (select 1 from   (
                            SELECT
                        t.*,
                      ROWID,
                      LEAD(EFFECTIVE_DT_FROM, 1) OVER(PARTITION BY CITY, POSTAL_CD
                                                 ORDER BY EFFECTIVE_DT_FROM) AS next_date
                         from test_tbl T)t
                         where b.rowid=t.rowid 
                        AND valid_to != next_date);
                        commit;
    EXIT WHEN SQL%ROWCOUNT=0; 
    END LOOP;
    END;

这篇关于数据完整性SQL查询可修复表中的错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆