oracle - convert many date formats to a single formatted date

Problem description

I want to convert strings that contain dates in various formats into a single date format. For example:

  • 13-06-2012 to 13-JUN-12
  • 13/06/2012 to 13-JUN-12
  • 13-JUN-2012 to 13-JUN-12
  • 13/JUN/2012 to 13-JUN-12
  • ...

I tried deleting all special characters and then using a function to transform the resulting string into a date in a single format. My function keeps raising exceptions, and I don't know why...

Function:

CREATE OR REPLACE FUNCTION normalize_date (data_in IN VARCHAR2)
    RETURN DATE
IS
    tmp_month         VARCHAR2 (3);
    tmp_day           VARCHAR2 (2);
    tmp_year          VARCHAR2 (4);
    TMP_YEAR_NUMBER   NUMBER;
    result            DATE;
BEGIN
    tmp_day := SUBSTR (data_in, 1, 2);
    tmp_year := SUBSTR (data_in, -4);

    --if(REGEXP_LIKE(SUBSTR(data_in,3,2), '[:alpha:]')) then 
    if(SUBSTR(data_in,3,1) in ('a','j','i','f','m','s','o','n','d','A','J','I','F','M','S','O','N','D')) then      
    tmp_month := UPPER(SUBSTR (data_in, 3, 3));
    else
    tmp_month := SUBSTR (data_in, 3, 2);
    end if;

    DBMS_OUTPUT.put_line (tmp_year);

    TMP_YEAR_NUMBER := TO_NUMBER (tmp_year);

    IF (tmp_month = 'JAN')
    THEN
        tmp_month := '01';
    END IF;

    IF (tmp_month = 'FEB')
    THEN
        tmp_month := '02';
    END IF;

    IF (tmp_month = 'MAR')
    THEN
        tmp_month := '03';
    END IF;

    IF (tmp_month = 'APR')
    THEN
        tmp_month := '04';
    END IF;

    IF (tmp_month = 'MAY')
    THEN
        tmp_month := '05';
    END IF;

    IF (tmp_month = 'JUN')
    THEN
        tmp_month := '06';
    END IF;

    IF (tmp_month = 'JUL')
    THEN
        tmp_month := '07';
    END IF;

    IF (tmp_month = 'AUG')
    THEN
        tmp_month := '08';
    END IF;

    IF (tmp_month = 'SEP')
    THEN
        tmp_month := '09';
    END IF;

    IF (tmp_month = 'OCT')
    THEN
        tmp_month := '10';
    END IF;

    IF (tmp_month = 'NOV')
    THEN
        tmp_month := '11';
    END IF;

    IF (tmp_month = 'DEC')
    THEN
        tmp_month := '12';
        END IF;

   -- dbms_output.put_line(tmp_day || '~'||tmp_year || '~' ||tmp_month);

    IF (LENGTH (tmp_day || tmp_year || tmp_month) <> 8)
    THEN
        result := TO_DATE ('31122999', 'DDMMYYYY');
        RETURN result;
    END IF;

 --   dbms_output.put_line('before end');
    result:=TO_DATE (tmp_day || tmp_month ||tmp_year , 'DDMMYYYY');
 --   dbms_output.put_line('date result: '|| result);
    RETURN result;
EXCEPTION
    WHEN NO_DATA_FOUND
    THEN
        NULL;
    WHEN OTHERS
    THEN
        result := TO_DATE ('3012299', 'DDMMYYYY');
        RETURN result;
        RAISE;
END normalize_date;

Usage:

SELECT customer_no,
       str_data_expirare,
       normalize_date (str_data_expirare_trim) AS data_expirare_buletin
  FROM (SELECT customer_no,
               str_data_expirare,
               REGEXP_REPLACE (str_data_expirare, '[^a-zA-Z0-9]+', '')
                   AS str_data_expirare_trim
          FROM (SELECT Q1.set_act_id_1,
                       Q1.customer_no,
                       NVL (SUBSTR (set_act_id_1,
                                      INSTR (set_act_id_1,
                                             '+',
                                             1,
                                             5)
                                    + 1,
                                    LENGTH (set_act_id_1)),
                            'NULL')
                           AS str_data_expirare
                  FROM STAGE_CORE.IFLEX_CUSTOMERS Q1
                  WHERE Q1.set_act_id_1 IS NOT NULL
                  )
        );

Recommended answer

If you have a sound idea of all the possible date formats it might be easier to use brute force:

create or replace function clean_date
    ( p_date_str in varchar2)
    return date
is
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('DD-MON-YYYY', 'DD-MON-YY', 'DD-MM-YYYY', 'MM-DD-YYYY', 'YYYY-MM-DD'
         , 'DD/MM/YYYY', 'MM/DD/YYYY', 'YYYY/MM/DD', 'DD/MM/YY', 'MM/DD/YY');
    return_value date;
begin
    for idx in l_dt_fmt_nt.first()..l_dt_fmt_nt.last()
    loop
        begin
            return_value := to_date(p_date_str, l_dt_fmt_nt(idx));
            exit;
        exception
             when others then null;
        end;
    end loop;
    if return_value is null then
        raise no_data_found; 
    end if;
    return return_value;
exception
    when no_data_found then
        raise_application_error(-20000, p_date_str|| ' is unknown date format');
end clean_date;
/

Be aware that modern versions of Oracle are quite forgiving with date conversion. This function handled dates in formats which aren't in the list, with some interesting consequences:

SQL> select  clean_date('20160817') from dual;

CLEAN_DAT
---------
17-AUG-16

SQL> select  clean_date('160817') from dual;

CLEAN_DAT
---------
16-AUG-17

SQL> 

Which demonstrates the limits of automated data cleansing in the face of lax data integrity rules. The wages of sin is corrupted data.
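
One way to rein in that leniency is the FX ("format exact") modifier, which makes to_date insist that the punctuation and digit counts in the string match the mask. The block below is a minimal sketch and not part of the original answer: strict_date is a hypothetical helper, and the three sample strings are only illustrations. Note that with FX, '-' and '/' are no longer interchangeable, so each separator needs its own mask.

```sql
set serveroutput on

declare
    -- Hypothetical helper for this sketch: returns the parsed date,
    -- or null when the string does not match the mask exactly.
    function strict_date (p_str in varchar2, p_mask in varchar2)
        return date
    is
    begin
        -- FX switches off Oracle's lenient matching for this mask.
        return to_date(p_str, 'FX' || p_mask);
    exception
        when others then
            return null;
    end strict_date;
begin
    -- String and mask agree, including the hyphens.
    dbms_output.put_line(
        nvl(to_char(strict_date('13-06-2012', 'DD-MM-YYYY'), 'YYYY-MM-DD'), 'rejected'));
    -- No separators in the string, so an exact match is not possible.
    dbms_output.put_line(
        nvl(to_char(strict_date('20160817', 'YYYY-MM-DD'), 'YYYY-MM-DD'), 'rejected'));
    dbms_output.put_line(
        nvl(to_char(strict_date('160817', 'DD-MM-YYYY'), 'YYYY-MM-DD'), 'rejected'));
end;
/
```

The first call should print 2012-06-13 and the other two should print rejected, because neither string contains the hyphens the mask demands. The trade-off is that every separator variant you want to accept has to be listed explicitly.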

@AlexPoole raises the matter of using the 'RR' format. This element of the date mask was introduced as a Y2K kludge. It's rather depressing that we're still discussing it almost two decades into the new Millennium.

Anyway, the issue is this. If we cast the string '161225' to a date, what century does it have? Well, 'yymmdd' will give 2016-12-25. Fair enough, but what about '991225'? How likely is it that the date we really want is 2099-12-25? This is where the 'RR' format comes into play. Basically it defaults the century: numbers 00-49 default to 20, 50-99 default to 19. This window was determined by the Y2K issue: in 2000 it was more likely that '98 referred to the recent past than the near future, and similar logic applied to '02. Hence the halfway point of 1950. Note this is a fixed point, not a sliding window. The further we move from the year 2000, the less useful that pivot point becomes.
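
To see the difference, the query below parses the same string with both masks. It is a small illustration rather than part of the original answer, and the results noted in the comments assume it is run in a year between 2000 and 2049.

```sql
-- YY takes the century from the current date; RR applies the 50-year pivot.
select to_char(to_date('991225', 'YYMMDD'), 'YYYY-MM-DD') as with_yy,  -- 2099-12-25
       to_char(to_date('991225', 'RRMMDD'), 'YYYY-MM-DD') as with_rr   -- 1999-12-25
  from dual;
```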

Anyway, the key point is that 'RRRR' does not play nicely with other date formats: to_date('501212', 'rrrrmmdd') hurls ORA-01843: not a valid month. So, use 'RR' and test for it before using 'YYYY'. So my revised function (with some tidying up) looks like this:

create or replace function clean_date
    ( p_date_str in varchar2)
    return date
is
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('DD-MM-RR', 'MM-DD-RR', 'RR-MM-DD', 'RR-DD-MM'
         , 'DD-MM-YYYY', 'MM-DD-YYYY', 'YYYY-MM-DD', 'YYYY-DD-MM');
    return_value date;
begin
    for idx in l_dt_fmt_nt.first()..l_dt_fmt_nt.last()
    loop
        begin
            return_value := to_date(p_date_str, l_dt_fmt_nt(idx));
            exit;
        exception
             when others then null;
        end;
    end loop;
    if return_value is null then
        raise no_data_found; 
    end if;
    return return_value;
exception
    when no_data_found then
        raise_application_error(-20000, p_date_str|| ' is unknown date format');
end clean_date;
/

The key point remains: there's a limit to how smart we can make this function when it comes to interpreting dates, so make sure you lead with the best fit. If you think most of your date strings fit day-month-year, put that first; you will still get some wrong casts, but fewer than if you lead with year-month-day.
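
The ordering matters because many strings are valid under more than one mask. As an illustration (the sample string is an assumption, not taken from the question's data), both conversions below succeed but disagree on the date, so whichever mask comes first in the collection decides the result:

```sql
-- The same string parsed day-first and month-first.
select to_char(to_date('01-02-2012', 'DD-MM-YYYY'), 'YYYY-MM-DD') as day_first,   -- 2012-02-01
       to_char(to_date('01-02-2012', 'MM-DD-YYYY'), 'YYYY-MM-DD') as month_first  -- 2012-01-02
  from dual;
```

Only strings with a day above 12 are unambiguous between these two masks, which is why the order should reflect what the bulk of the data is expected to look like.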
