ORDER BY with diacritic in Postgres
Question
I need to select data from a table and sort it with an ORDER BY clause. The problem is that the column contains text data with Czech diacritics. I cannot use COLLATE, because the DB is part of a postgres cluster that was created with lc_collate = en_US.UTF-8, and I cannot afford the downtime caused by recreating the cluster with the correct lc_collate.
Sample data:
CREATE TABLE test (
    id   serial PRIMARY KEY,
    name text
);
INSERT INTO test (name) VALUES ('Žoo'), ('Zoo'), ('ŽOO'), ('ZOO'),
                               ('ŽoA'), ('ŽóA'), ('ŽoÁ'), ('ŽóÁ');
Ideal output:
SELECT * FROM test ORDER BY name COLLATE "cs_CZ.utf8";
id | name
----+------
2 | Zoo
4 | ZOO
5 | ŽoA
7 | ŽoÁ
6 | ŽóA
8 | ŽóÁ
1 | Žoo
3 | ŽOO
(8 rows)
Here I found a kind of solution:
SELECT * FROM test ORDER BY name USING ~<~;
id | name
----+------
4 | ZOO
2 | Zoo
3 | ŽOO
5 | ŽoA
1 | Žoo
7 | ŽoÁ
6 | ŽóA
8 | ŽóÁ
(8 rows)
The result is close enough (for my usage): the caroned letters come AFTER the non-caroned ones.
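A rough way to see where this order comes from: as far as I understand it, ~<~ is one of the text_pattern_ops operators, which compare strings without applying the locale's collation rules, so in a UTF-8 database the sort is effectively byte by byte. The Python sketch below is not Postgres code; it only mimics that comparison on the sample data. The names rows and bytewise_key are made up for illustration, and it assumes the strings are stored in precomposed (NFC) form.

```python
import unicodedata

# Sample rows from the question: (id, name).
rows = [
    (1, "Žoo"), (2, "Zoo"), (3, "ŽOO"), (4, "ZOO"),
    (5, "ŽoA"), (6, "ŽóA"), (7, "ŽoÁ"), (8, "ŽóÁ"),
]


def bytewise_key(row):
    """Sort key: the UTF-8 bytes of the name (hypothetical helper).

    Assumption: the database stores precomposed characters, so the
    name is normalized to NFC before encoding.
    """
    _row_id, name = row
    return unicodedata.normalize("NFC", name).encode("utf-8")


# Python compares bytes objects byte by byte, ignoring any locale.
bytewise_sorted = sorted(rows, key=bytewise_key)

for row_id, name in bytewise_sorted:
    print(f"{row_id:>2} | {name}")
```

Under these assumptions the ids come out in the same order as the USING ~<~ query above. Plain Z (one byte, 0x5A) sorts before Ž (two bytes starting with 0xC5), and uppercase ASCII sorts before lowercase, which is why ZOO precedes Zoo and every caroned name lands after the non-caroned ones.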
Slightly off topic from my question: an analysis of the ~<~ operator in PostgreSQL.
EDIT: this turned into a new question.
Back to the question: is there another solution to get the ideal order besides recreating the postgres cluster with the correct locale?
An explanation of the ~<~ operator would be nice as well.
Recommended answer
As @Igor pointed out in his comment, there is no need to recreate the postgres cluster with a different lc_collate and deal with the resulting downtime.
The exact steps that solved the problem were:
- add/uncomment the cs_CZ line (UTF-8 variant) in /etc/locale.gen
- generate the new locale:
# locale-gen
- define the new collation in postgres:
CREATE COLLATION "cs_CZ.utf8" (LOCALE = 'cs_CZ.UTF-8');