Oracle SQL 动态查询非 UTF-8 字符 [英] Oracle SQL Dynamic Query non UTF-8 Characters

查看:83
本文介绍了Oracle SQL 动态查询非 UTF-8 字符的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试编写一个查询,该查询将在表中提供所有非 UTF-8 编码的字符,而不是特定于列名.我这样做是通过比较不等于字节长度的列的长度.%1 是我想在参数中输入的表名.我正在加入 user_tab_columns 以获取 COLUMN_NAME.然后我想获取 COLUMN_NAME 结果并向下过滤以仅显示具有错误 UTF-8 数据的行(其中列的长度不等于字节长度).下面是我想出的,但它不起作用.有人可以帮我调整这个查询以获得想要的结果吗?

I am trying to write a query that will provide all non UTF-8 encoded characters in a table, that is not specific to a column name. I am doing so by comparing the length of a column not equal to the byte length. %1 is the table name I want to check entered in a parameter. I am joining to user_tab_columns to get the COLUMN_NAME. I then want to take the COLUMN_NAME results and filter down to only show rows that have bad UTF-8 data (where length of a column is not equal to the byte length). Below is what I have come up with but it's not functioning. Can somebody help me tweak this query to get desired results?

 SELECT
 user_tab_columns.TABLE_NAME,
 user_tab_columns.COLUMN_NAME AS ColumnName,
 a.*

 FROM %1 a

 JOIN user_tab_columns
 ON UPPER(user_tab_columns.TABLE_NAME) = UPPER('%1')

 WHERE (SELECT * FROM %1 WHERE LENGTH(a.ColumnName) != LENGTHB(a.ColumnName))

推荐答案

在您的查询中 LENGTH(a.ColumnName) 将表示列名称的长度,而不是该列的内容.在静态 SQL 中,您不能将一个表中的值用作另一个表中的列名.

In your query LENGTH(a.ColumnName) would represent the length of the column name, not the contents of that column. You can't use a value from one table as the column name in another table in static SQL.

这是在匿名块中使用动态 SQL 报告哪些列包含任何多字节字符的简单演示,这是比较 lengthlengthb 会告诉你的(讨论在评论中不要在这里重复):

Here's a simple demonstration of using dynamic SQL in an anonymous block to report which columns contain any multibyte characters, which is what comparing length with lengthb will tell you (discussed in comments to not rehashing that here):

set serveroutput on size unlimited
declare
  sql_str varchar2(256);
  flag pls_integer;
begin
  for rec in (
    select utc.table_name, utc.column_name
    from user_tab_columns utc
    where utc.table_name = <your table name or argument>
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
    order by utc.column_id
  ) loop

    sql_str := 'select nvl(max(1), 0) from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '") and rownum = 1';

    -- just for debugging, to see the generated query
    dbms_output.put_line(sql_str);

    execute immediate sql_str into flag;

    -- also for debugging
    dbms_output.put_line (rec.table_name || '.' || rec.column_name
      || ' flag: ' || flag);

    if flag = 1 then
      dbms_output.put_line(rec.table_name || '.' || rec.column_name
        || ' contains multibyte characters');
    end if;
  end loop;
end;
/

这使用游标循环来获取列名 - 我也包含了表名,以防您想使用通配符或删除过滤器 - 在该循环内构造一个动态 SQL 语句,将其执行到一个变量中,然后检查该变量.我留下了一些调试输出来看看发生了什么.使用创建的虚拟表:

This uses a cursor loop to get the column names - I've included the table name too in case you want to wild-card or remove the filter - and inside that loop constructs a dynamic SQL statement, executes it into a variable, and then checks that variable. I've left some debugging output in to see what's happening. With a dummy table created as:

create table t42 (x varchar2(20), y varchar2(20));
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'multibyte ' || unistr('\00FF'));

运行那个块得到输出:

anonymous block completed
select nvl(max(1), 0) from "T42" where length("X") != lengthb("X") and rownum = 1
T42.X flag: 0
select nvl(max(1), 0) from "T42" where length("Y") != lengthb("Y") and rownum = 1
T42.Y flag: 1
T42.Y contains multibyte characters

要显示实际包含多字节的值,您可以对所选值使用动态循环:

To display the actual multibyte-containing values you could use a dynamic loop over the selected values:

set serveroutput on size unlimited
declare
  sql_str varchar2(256);
  curs sys_refcursor;
  val_str varchar(4000);
begin
  for rec in (
    select utc.table_name, utc.column_name
    from user_tab_columns utc
    where utc.table_name = 'T42'
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
    order by utc.column_id
  ) loop

    sql_str := 'select "' || rec.column_name || '" '
      || 'from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '")';

    -- just for debugging, to see the generated query
    dbms_output.put_line(sql_str);

    open curs for sql_str;
    loop
      fetch curs into val_str;
      exit when curs%notfound;

      dbms_output.put_line (rec.table_name || '.' || rec.column_name
        || ': ' || val_str);
    end loop;
  end loop;
end;
/

同一张表得到:

anonymous block completed
select "X" from "T42" where length("X") != lengthb("X")
select "Y" from "T42" where length("Y") != lengthb("Y")
T42.Y: multibyte ÿ

无论如何作为起点;如果您有 CLOB 值、NVARCHAR2 或 NCLOB,则需要进行一些调整 - 例如,您可以拥有每种类型的一个局部变量,将数据类型包含在外部游标查询中,然后提取到适当的局部变量中.

As a starting point anyway; it would need some tweaking if you have CLOB values, or NVARCHAR2 or NCLOB - for example you could have one local variable of each type, include the data type in the outer cursor query, and fetch into the appropriate local variable.

这篇关于Oracle SQL 动态查询非 UTF-8 字符的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆