In Django with PostgreSQL 9.6, how to sort case- and accent-insensitively?
Problem description
What I would like is the equivalent of using utf8_unicode_ci in MySQL. So if I have these strings (shown in PostgreSQL's default sort order):
- Barn
- Bubble
- Bœuf
- beef
- boulette
- bémol
I would like them to be sorted like this (as with utf8_unicode_ci in MySQL):
- Barn
- beef
- bémol
- Bœuf
- boulette
- Bubble
This kind of sort is case-insensitive and accent-insensitive, and ligatures are expanded into multiple characters.
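To make the target ordering concrete, here is a minimal Python sketch of those three rules (lowercase, strip accents, expand ligatures). It is not part of the original question: `sort_key` and `LIGATURES` are hypothetical names, the ligature table is deliberately tiny, and the function only approximates what utf8_unicode_ci does rather than implementing the full Unicode Collation Algorithm.

```python
import unicodedata

# Hypothetical, deliberately small ligature table (assumption: extend as needed).
LIGATURES = {"œ": "oe", "æ": "ae"}


def sort_key(text):
    """Return a lowercase, accent-free, ligature-expanded form of text."""
    lowered = text.lower()
    # Expand ligatures first; NFKD does not decompose "œ" or "æ".
    expanded = "".join(LIGATURES.get(ch, ch) for ch in lowered)
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


words = ["Barn", "Bubble", "Bœuf", "beef", "boulette", "bémol"]
print(sorted(words, key=sort_key))
# ['Barn', 'beef', 'bémol', 'Bœuf', 'boulette', 'Bubble']
```

Sorting in Python like this only illustrates the goal; for large tables the ordering would still need to happen in the database.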
I know about unaccent and lower in PostgreSQL, but I have no idea how to use them from Django.
Possible solutions with Django/PostgreSQL:
- Add a new column used only for sorting, holding the normalized data (lower, unaccent).
- Add an index (like in this answer), but I'm not sure how that would work with Django.
I don't think Full Text Search or Trigram could help me here, because I'm not necessarily searching based on text; I need to get the right sort order.
Ideally queries should be fast, so using another indexed column looks like a good avenue. But I would like to find a solution that I don't need to implement for every existing text column in my DB, that is easy to maintain, etc. Is there a best practice for this?
Recommended answer
This isn't related to Django itself; PostgreSQL's lc_collate configuration determines the sort order. I'd suggest checking its value:
SHOW lc_collate;
The right thing to do is to fix this configuration. Don't forget to take a look at the related settings too (lc_ctype, etc.).
But if you cannot create another database with the right settings, try an explicit COLLATE in the ORDER BY clause, as in the following test case:
CREATE TEMPORARY TABLE table1 (column1 TEXT);
INSERT INTO table1 VALUES('Barn'),
('beef'),
('bémol'),
('Bœuf'),
('boulette'),
('Bubble');
SELECT * FROM table1 ORDER BY column1 COLLATE "en_US"; --Gives the expected order
SELECT * FROM table1 ORDER BY column1 COLLATE "C"; --Gives "wrong" order (in your case)
It's important to remember that PostgreSQL relies on operating system locales. This test case was executed on CentOS 7. More info here and here (https://stackoverflow.com/a/36423340/7925366).