CHAR semantics and ORA-01461

Question

I maintain a PHP-driven application with an Oracle backend (OCI8 functions). The app is developed with Oracle 10g XE and deployed on whatever version the customer owns.

The application handles single-byte text (ISO-8859-15) and I've never had any problem while developing against the Western European edition of Oracle XE. However, I've recently installed the Universal edition and I'm having issues when inserting large strings with non-ASCII chars. This version sets NLS_CHARACTERSET = AL32UTF8; since my app uses WE8ISO8859P15, Oracle silently converts my input data from ISO-8859-15 to UTF-8 (which is fine). But it seems that certain size checks go wrong: a string with 1500 characters (1500 bytes in ISO-8859-15, 4500 bytes in UTF-8) appears to overflow a VARCHAR2(4000 CHAR) column.

I've created this test table:

CREATE TABLE FOO (
    FOO_ID NUMBER NOT NULL ENABLE,
    DATA_BYTE VARCHAR2(4000 BYTE),
    DATA_CHAR VARCHAR2(4000 CHAR),

    CONSTRAINT FOO_PK PRIMARY KEY (FOO_ID)
);

The problem can be reproduced with this code:

<?php
// Connect using the single-byte client character set the application works in.
$connection = oci_connect(DB_USER, DB_PASS, DB_CONN_STRING, 'WE8ISO8859P15');
if( !$connection ){
    $e = oci_error(); // connection errors: call without arguments
    die(htmlspecialchars($e['message']));
}

$id = 1;
// 1500 euro signs; the script must be saved as ISO-8859-15 so each one is 1 byte.
$data = str_repeat('€', 1500);

$sql = 'INSERT INTO FOO (FOO_ID, DATA_CHAR) ' .
    'VALUES (:id, :data)';
$res = oci_parse($connection, $sql);
if(!$res){
    $e = oci_error($connection); // parse errors are reported on the connection
    die(htmlspecialchars($e['message']));
}
if(!oci_bind_by_name($res, ':id', $id)){
    $e = oci_error($res); // bind and execute errors are reported on the statement
    die(htmlspecialchars($e['message']));
}
if(!oci_bind_by_name($res, ':data', $data)){
    $e = oci_error($res);
    die(htmlspecialchars($e['message']));
}
if(!oci_execute($res, OCI_COMMIT_ON_SUCCESS)){
    $e = oci_error($res);
    die(htmlspecialchars($e['message']));
}

... which triggers:

Warning: oci_execute(): ORA-01461: sólo puede enlazar un valor LONG para insertarlo en una columna LONG

(The English text of this message is "ORA-01461: can bind a LONG value only for insert into a LONG column".)

It is the same error I get when I try to insert a 4001-character string. It doesn't happen if I insert xxx... instead of €€€, and it doesn't happen if I save my script as UTF-8 and connect as such:

<?php
$connection = oci_connect(DB_USER, DB_PASS, DB_CONN_STRING, 'AL32UTF8');

[Update: My test was flawed. Using UTF-8 doesn't avoid ORA-01461.]

How can I work around this problem? The NLS_CHARACTERSET database parameter is not something I control, and switching my app to UTF-8 is likely to cause other problems (almost all our customers have single-byte databases).

Recommended answer

This is probably not something that you can work around unless you want to use a CLOB instead of a VARCHAR2.

In Oracle, when you declare a column, the default is to use byte-length semantics. So a VARCHAR2(100), for example, allocates 100 bytes of storage. If you're using a single-byte character set like ISO 8859-1, every character requires 1 byte of storage, so this also allocates space for 100 characters. But if you are using a multi-byte character set like UTF-8, each character can require between 1 and 4 bytes of storage. Depending on the data, therefore, a VARCHAR2(100) may only be able to store 25 characters of data (English characters generally require 1 byte, European characters generally require 2 bytes, and Asian characters generally require 3 bytes).
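To make the byte arithmetic above concrete, here is a minimal PHP sketch. It only measures UTF-8 byte widths on the client side and does not talk to Oracle; the helper name charsThatFit is hypothetical and introduced purely for illustration.

```php
<?php
// UTF-8 byte widths of a few sample characters.
// "\u{...}" escapes (PHP 7+) produce UTF-8 encoded strings.
$samples = [
    'a (U+0061)'      => "a",
    'o-umlaut (U+00F6)' => "\u{00F6}",
    'euro (U+20AC)'   => "\u{20AC}",
    'emoji (U+1F600)' => "\u{1F600}",
];

// Hypothetical helper: how many copies of one character fit in a column
// limited to $byteLimit bytes (byte-length semantics).
function charsThatFit(string $char, int $byteLimit): int
{
    // strlen() counts bytes, not characters.
    return intdiv($byteLimit, strlen($char));
}

foreach ($samples as $label => $char) {
    printf(
        "%s: %d byte(s), a 100-byte column holds %d\n",
        $label,
        strlen($char),
        charsThatFit($char, 100)
    );
}
```

The four samples take 1, 2, 3 and 4 bytes respectively, so a 100-byte column holds 100, 50, 33 or 25 of them, which is where the "as few as 25 characters" figure comes from.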

You can tell Oracle to use character-length semantics, which is normally what I'd suggest when moving from an ISO-8859-1 database to a UTF-8 database. If you declare a column VARCHAR2(100 CHAR), Oracle will allocate space for 100 characters regardless of whether that ends up being 100 bytes or 400 bytes. You can also set the NLS_LENGTH_SEMANTICS parameter to CHAR to change the default (for new DDL) so that a VARCHAR2(100) allocates 100 characters of storage rather than 100 bytes.

Unfortunately for you, though, the limit on the size of an Oracle VARCHAR2 (in the context of the SQL engine rather than the PL/SQL engine) is 4000 bytes. So even if you declare a column VARCHAR2(4000 CHAR), you're still going to be limited to actually inserting 4000 bytes of data, which may be as few as 1000 characters. For example, in a database using the AL32UTF8 character set, I can declare a column VARCHAR2(4000 CHAR), but inserting a character that requires 2 bytes of storage shows that I can't really insert 4000 characters of data:

SQL> create table foo (
  2    col1 varchar2(4000 char)
  3  );

Table created.

SQL> insert into foo values( rpad( 'abcde', 4000, unistr('\00f6') ) );

1 row created.

SQL> ed
Wrote file afiedt.buf

  1* insert into foo values( rpad( 'abcde', 6000, unistr('\00f6') ) )
SQL> /

1 row created.

SQL> select length(col1), lengthb(col1)
  2    from foo;

LENGTH(COL1) LENGTHB(COL1)
------------ -------------
        2003          4000
        2003          4000

If you need to store 4000 characters of UTF-8 data, you'd need a data type that can handle 16000 bytes (4000 characters at up to 4 bytes each), which would necessitate moving to a CLOB.
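On the PHP side, one way to tell in advance which values will hit the limit is to measure the string after conversion to UTF-8. The following is only a sketch under stated assumptions: the database character set is AL32UTF8, the 4000-byte limit described above applies, and the function fitsVarchar2 and the constant VARCHAR2_MAX_BYTES are hypothetical names, not part of OCI8.

```php
<?php
// Hard limit for a VARCHAR2 value in the SQL engine, as described above.
const VARCHAR2_MAX_BYTES = 4000;

// Hypothetical helper: would $data (encoded in $clientCharset) fit into a
// VARCHAR2($declaredChars CHAR) column of an AL32UTF8 database?
function fitsVarchar2(string $data, string $clientCharset, int $declaredChars): bool
{
    // Convert to UTF-8, roughly what Oracle does with the bound value.
    $utf8 = iconv($clientCharset, 'UTF-8', $data);
    if ($utf8 === false) {
        throw new InvalidArgumentException("Cannot convert from $clientCharset");
    }
    $chars = iconv_strlen($utf8, 'UTF-8'); // length in characters
    $bytes = strlen($utf8);                // length in bytes
    // Both the declared character limit and the byte limit must hold.
    return $chars <= $declaredChars && $bytes <= VARCHAR2_MAX_BYTES;
}

$euro = "\xA4"; // the euro sign as a single byte in ISO-8859-15

// 1500 euro signs become 4500 bytes in UTF-8: too large.
var_dump(fitsVarchar2(str_repeat($euro, 1500), 'ISO-8859-15', 4000));
// 1333 euro signs become 3999 bytes: fits.
var_dump(fitsVarchar2(str_repeat($euro, 1333), 'ISO-8859-15', 4000));
// 4000 ASCII characters stay at 4000 bytes: fits.
var_dump(fitsVarchar2(str_repeat('x', 4000), 'ISO-8859-15', 4000));
```

A check like this does not remove the limit; it only lets the application reject or reroute oversized values (for example to a CLOB column) before Oracle raises ORA-01461.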
