与PostgreSQL奇怪的排序规则 [英] Strange collation with postgresql
问题描述
我注意到postgresql-9.5有一个奇怪的排序规则问题,因为它给Python脚本提供了不同的输出。据我了解,排序时通常是从左到右一次比较一个字符:
I noticed a strange collation issue with postgresql-9.5 as it was giving different output to a Python script. As I understand it, normally characters are compared one at a time from left to right when sorting:
select 'ab' < 'ac';
t
select 'abX' < 'ac';
t
因此,如果在上面的左手字符串中添加 X,则无关紧要。
So it's irrelevant if you add the 'X' to the left hand string above.
因此,当它不能用于空格和破折号之间的比较时,我感到很惊讶:
So I was surprised when this does not hold for comparison between a space and a dash:
select 'a ' < 'a-';
t
select 'a X' < 'a-';
f
这是一个错误还是有解决办法?我希望上面的最后一条语句也返回true。
Is it a bug or is there any way around this; I want the last statement above to also return true.
[接受@ laurenz-albe的回答后编辑]
show lc_collate;
en_US.UTF-8
如果我在Python中按Unicode排序,我会得到类似的结果因此,这不是Postgresql中的错误,而是我认为Unicode排序规范中的错误!
If I do sorting by unicode in Python I get similar results so it's not a bug in postgresql, but a bug in the unicode collation spec in my opinion!:
>>> import locale; locale.setlocale(locale.LC_ALL, "")
'en_US.UTF-8'
>>> l = ['ac', 'ab']; sorted(l) == sorted(l, cmp=locale.strcoll)
True
>>> l = ['ac', 'abX']; sorted(l) == sorted(l, cmp=locale.strcoll)
True
>>> l = ['ac', 'abX']; sorted(l) == sorted(l, cmp=locale.strcoll)
True
>>> l = ['a-', 'a X']; sorted(l) == sorted(l, cmp=locale.strcoll)
False
>>> sorted(l), sorted(l, cmp=locale.strcoll)
(['a X', 'a-'], ['a-', 'a X'])
推荐答案
使用归类比较字符。您可以使用SQL命令
Characters are compared using a collation. You can use the SQL command
SHOW lc_collate;
看看您的默认排序规则是什么。
to see what your default collation is.
PostgreSQL使用操作系统的归类,因此比较的结果通常取决于操作系统。
PostgreSQL uses the operating system's collations, so the result of a comparison will often depend on the operating system.
要强制按字节ASCII比较,请使用C归类
To force byte-wise ASCII comparison, use the C collation
test=> SELECT 'a X' COLLATE "C" < 'a-';
?column?
----------
t
(1 row)
或按字节比较的运算符
test=> SELECT 'a X' ~<~ 'a-';
?column?
----------
t
(1 row)
这篇关于与PostgreSQL奇怪的排序规则的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!