Alphanumeric Sorting in PostgreSQL

Question

I have this table with a character varying column in Postgres 9.6:

id | column 
------------
1  |IR ABC-1
2  |IR ABC-2
3  |IR ABC-10

I see some solutions typecasting the column as bytea.

select * from table order by column::bytea;

But it always results in:

id | column 
------------
1  |IR ABC-1
2  |IR ABC-10
3  |IR ABC-2

I don't know why '10' always comes before '2'. How do I sort this table, assuming the basis for ordering is the last whole number of the string, regardless of what the character before that number is?

Recommended answer

A possible problem with sorting character data types is that collation rules apply (unless you work with locale "C", which simply sorts characters by their byte values). Applying collation rules may or may not be desirable. It makes sorting more expensive in any case. If you want to sort without collation rules, don't cast to bytea; use COLLATE "C" instead:

SELECT * FROM table ORDER BY column COLLATE "C";
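
As a minimal sketch of what byte-order sorting does with the sample values from the question, the query below uses an inline VALUES list in place of a real table. The alias t and the column name col are placeholders chosen for illustration, not names from the original post.

```sql
-- Sample rows from the question, supplied inline (t and col are illustrative names).
SELECT id, col
FROM  (VALUES (1, 'IR ABC-1')
            , (2, 'IR ABC-2')
            , (3, 'IR ABC-10')) AS t(id, col)
ORDER  BY col COLLATE "C";  -- compare byte by byte, no locale rules
-- 'IR ABC-10' still sorts before 'IR ABC-2',
-- because the character '1' has a lower byte value than '2'.
```

The comparison is made character by character from the left, so the digits are never seen as a number.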

However, this does not yet solve the problem with numbers in the string you mention. You have to split the string and sort the numeric part as a number:

SELECT *
FROM   table
ORDER  BY split_part(column, '-', 2)::numeric;
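
To try this without creating a table, here is a self-contained sketch of the same query over the question's sample rows. The alias t and the column name col are placeholders for illustration. It assumes every value contains exactly one hyphen followed only by digits; a value such as 'IR-ABC-10' would make the cast fail, because the second field would be 'ABC'.

```sql
-- Same technique as above, with the question's rows supplied inline.
SELECT id, col
FROM  (VALUES (1, 'IR ABC-1')
            , (2, 'IR ABC-2')
            , (3, 'IR ABC-10')) AS t(id, col)
ORDER  BY split_part(col, '-', 2)::numeric;  -- '1', '2', '10' compared as numbers
-- Rows come back in the order 1, 2, 10.
```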

Or, if all your numbers fit into bigint or even integer, use that instead.

I ignored the leading part because you write:


... the basis for ordering is the last whole number of the string, regardless of what the character before that number is.
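
If the separator before the number is not always a hyphen, one possible alternative (not part of the original answer) is to extract the trailing digits with a regular expression instead of split_part(). This sketch assumes the number sits at the very end of the string and fits into integer; the sample values, the alias t and the column name col are made up for illustration.

```sql
-- Sort by the digits at the end of the string, whatever precedes them.
SELECT col
FROM  (VALUES ('IR-ABC-10')
            , ('IR XYZ_2')
            , ('IR ABC-1')) AS t(col)
ORDER  BY substring(col FROM '[0-9]+$')::integer;
-- substring(... FROM pattern) returns the part matching the POSIX regex,
-- or NULL when the string does not end in digits (NULL sorts last by default).
```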

Related:

  • Alphanumeric sorting with PostgreSQL
  • Split comma separated column data into additional columns
  • What is the impact of LC_CTYPE on a PostgreSQL database?

Typically, it's best to save distinct parts of a string in separate columns as proper respective data types to avoid any such confusion.
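
A rough sketch of that layout follows. The table name item and the columns prefix and num are hypothetical and only serve to illustrate the idea.

```sql
-- Hypothetical schema: the text part and the number live in separate columns.
CREATE TABLE item (
   id     serial  PRIMARY KEY
 , prefix text    NOT NULL   -- e.g. 'IR ABC'
 , num    integer NOT NULL   -- e.g. 10
);

INSERT INTO item (prefix, num)
VALUES ('IR ABC', 1), ('IR ABC', 10), ('IR ABC', 2);

-- Sorting is now a plain numeric sort; the label is assembled only for display.
SELECT id, prefix || '-' || num::text AS label
FROM   item
ORDER  BY prefix, num;
```

With the number stored as integer, no parsing or casting is needed at query time, and a plain index on (prefix, num) can support the sort.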

And if the leading string is identical for all columns, consider just dropping the redundant noise. You can always use a VIEW to prepend a string for display.
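
For illustration only, a view along these lines could prepend the constant string. The names doc and doc_display are hypothetical, and 'IR ABC-' is taken from the question's sample data.

```sql
-- Hypothetical table that stores only the number.
CREATE TABLE doc (
   id  serial  PRIMARY KEY
 , num integer NOT NULL
);

-- The view prepends the constant prefix for display purposes.
CREATE VIEW doc_display AS
SELECT id, num, 'IR ABC-' || num::text AS label
FROM   doc;

-- Sort on the numeric column, show the assembled label.
SELECT id, label
FROM   doc_display
ORDER  BY num;
```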
