Oracle使用NLS_SORT和简单的regexp_like的令人困惑的行为 [英] Oracle's puzzling behaviour with NLS_SORT and a simple regexp_like

查看:92
本文介绍了Oracle使用NLS_SORT和简单的regexp_like的令人困惑的行为的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

今天早上我从Oracle遇到了一个奇怪的行为...而且我从文档中无法理解为什么这样做.很长的帖子,我很抱歉,但是我想确保我了解我的意思. 哦,请务必在回答之前阅读完笔记:)

I ran into a weird behavior from Oracle this morning... And I can't understand why it acts this way from the docs. I'm sorry for the long post, but I want to make sure I'm understood. Oh, and make sure to read the note at the end before answering :)

请求的目的是返回行中包含1个或多个小写字符的行.为了这个例子,我的表将是:

The goal of the request is to return rows with 1 or more lowercase characters. For the sake of the example, my table will be:

CREATE TABLE "TEMP_TABLE" 
   ( "VAL" VARCHAR2(4000 BYTE) );
Insert into TEMP_TABLE (VAL) values ('00A00');
Insert into TEMP_TABLE (VAL) values ('00000');
Insert into TEMP_TABLE (VAL) values ('BC000');
Insert into TEMP_TABLE (VAL) values ('ABC00');
Insert into TEMP_TABLE (VAL) values ('AAAAA');
Insert into TEMP_TABLE (VAL) values ('abc00');

使用此SQL请求:

select val, 
case when regexp_like (val, '[a-b]') then 'MATCH' else 'NO' end from temp_table;

如果会话的 NLS_SORT 值设置为BINARY,则oracle返回:

If the NLS_SORT value of the session is set to BINARY, oracle returns:

00A00   NO
00000   NO
BC000   NO
ABC00   NO
AAAAA   NO
abc00   MATCH

=>一切都很好:唯一包含小写字母的单词匹配;其他人没有.

=>All good here: the only word containing a lowercase letter matches; the others don't.

但如果将 NLS_SORT 设置为FRENCH,则结果难以理解:

but if NLS_SORT is set to FRENCH, the results are less understandable:

00A00   NO
00000   NO
BC000   MATCH
ABC00   MATCH
AAAAA   NO
abc00   MATCH

据我所知,当有个非A的字符时,正则表达式匹配.

From what I can deduce, the regexp matches when there are characters other than A.

所以我的问题是:为什么Oracle会把[a-z]理解为字母不是A的行"?

So my question is: Why would Oracle understand [a-z] as 'rows with letters that are not A' ?

注意1:规格: 该数据库位于Oracle 10G(r2)下,会话的NLS参数如下:

Note 1: Specs: The database is under Oracle 10G(r2), and the Session's NLS Parameters are as follow:

NLS_CALENDAR    GREGORIAN
NLS_COMP        BINARY
NLS_CURRENCY    ¿
NLS_DATE_FORMAT DD/MM/RR HH24:MI
NLS_DATE_LANGUAGE   FRENCH
NLS_DUAL_CURRENCY   ¿
NLS_ISO_CURRENCY    FRANCE
NLS_LANGUAGE    FRENCH
NLS_LENGTH_SEMANTICS    BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NUMERIC_CHARACTERS  , 
NLS_SORT    FRENCH_M
NLS_TERRITORY   FRANCE
NLS_TIME_FORMAT HH24:MI:SSXFF
NLS_TIMESTAMP_FORMAT    DD/MM/RR HH24:MI:SSXFF
NLS_TIMESTAMP_TZ_FORMAT DD/MM/RR HH24:MI:SSXFF TZR
NLS_TIME_TZ_FORMAT  HH24:MI:SSXFF TZR

注意2:是的,我可以使用regexp_like(val,'[[:lower:]]').但是后来我发现了这一点,但并不能解释奇怪的行为.

Note 2 : Yes, I could use regexp_like(val, '[[:lower:]]') . But I found out about this later on, and it doesn't explain the weird behaviour.

推荐答案

不管是好是坏,nls_sort定义的排序顺序都用于评估[a-z]正则表达式.如果将a,b,c,A,B和C插入temp_table并在每种设置下对其进行排序,则会得到以下内容:

For better or worse, the sort ordering defined by nls_sort is being used to evaluate the [a-z] regexp. If you insert a,b,c,A,B, and C into temp_table and sort it under each setting you'll get the following:

SQL> alter session set nls_sort=BINARY;

Session altered.

SQL> select val,
  2  case when regexp_like (val, '[a-z]') then 'MATCH' else 'NO' end m
  3  from temp_table order by val;

VAL           M
------------------------- -------------------------
A             NO
B             NO
C             NO
a             MATCH
b             MATCH
c             MATCH

6 rows selected.

SQL> alter session set nls_sort=FRENCH;

Session altered.

SQL> select val,
  2  case when regexp_like (val, '[a-z]') then 'MATCH' else 'NO' end m
  3  from temp_table order by val;

VAL           M
------------------------- -------------------------
A             NO
a             MATCH
B             MATCH
b             MATCH
C             MATCH
c             MATCH

6 rows selected.

由于在法语设置中大写字母与小写字母是交织"的,因此在Oracle的实现中其评估结果为true.

Since the upper case letters are "interleaved" with the lower case letters in the French setting it evaluates to true in Oracle's implementation.

这篇关于Oracle使用NLS_SORT和简单的regexp_like的令人困惑的行为的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆