当列可以以字母开头或结尾时,使用MySQL对varchar列进行数字排序并强制转换为无符号 [英] Using MySQL sort varchar column numerically with cast as unsigned when the column can begin or end with letters

查看:87
本文介绍了当列可以以字母开头或结尾时,使用MySQL对varchar列进行数字排序并强制转换为无符号的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我遇到了一些我不确定如何处理的问题.我正在建立一个数据库来存储有关运动卡的信息,当我想查看某些运动卡时,我在进行一些排序时遇到了一些问题.

I've ran into something I'm not really sure how to handle here. I'm building a database to store information on sports cards, and I'm having a bit of an issue with some sorting when I want to see certain cards.

对于背景,每张卡(数据库中的行)都有有关年份的信息,设置卡的来源,卡上的玩家和卡号(比这更多的信息,但这是所有相关的信息).当我看到结果时,我希望按年份排序,然后设置,然后是玩家,然后是卡号.除了卡号以外的所有东西都可以正常工作,因为year只是一个整数,而set和player都是varchars,因此对它们进行排序很容易.但是,卡号是我遇到的一些问题.

For background, each card (row in the database) has information on year, set the card is from, player on the card, and card number (there's more info than that, but this is all that's relevant here). When I see results, I want things to be sorted by year, then set, then player, then card number. Everything but card number is working fine, as year is just an integer, and set and player are both varchars, so it's easy to sort those. However, the card number is what I'm running into some issues with.

卡号列为varchar,因为卡号可以包含字母,数字和破折号.最常见的是,卡号是一个正数(即1、2、3、4、5),一个正数字母(Ex-A,Ex-B,Ex-C),一个数字后跟一个字母(1a,1b) ,2、3a,3b,3c)或字母后跟数字(A1,A2,A3,A4,A5).这是我目前设置SQL字符串的排序部分的方式:

The card number column is a varchar since the card number can include letters, numbers, and dashes. Most commonly, a card number will be a straight number (i.e. 1, 2, 3, 4, 5), straight letters (Ex-A, Ex-B, Ex-C), a number followed by a letter (1a, 1b, 2, 3a, 3b, 3c), or a letter followed by a number (A1, A2, A3, A4, A5). This is how I currently have the sort portion of my SQL string set up:

order by year desc, cardset asc, subset asc, cast(cardNum as unsigned) asc;

这可以处理大多数问题.但是我遇到的问题是,当一组卡的卡号中具有相同的字母,然后又有一个数字时.我希望排序基本上忽略前导字母,然后按数字排序.但是,有时它不能正确执行此操作,特别是当要分类的卡片超过5张时.

This is handling MOST things fine. But what I'm having issues with is when a group of cards have the same letters in their card number, and then have a number. I want the sort to essentially ignore the leading letters and then just sort by the numbers. But, sometimes it doesn't do this correctly, particularly, when there's more than 5ish cards with this to sort.

具体地说,它错误地将具有以下卡号的某些卡按以下顺序排序:

Specifically, it's incorrectly sorting some cards with the following card numbers into the following order:

  • BCP61
  • BCP97
  • BCP32
  • BCP135

何时应导致:

  • BCP32
  • BCP61
  • BCP97
  • BCP135

当前正在排序正数或数字,然后正确地跟字母(即1、2、3、4、5或1、2、3a,3b,4、5a,5b).我不知道它有没有正确地对直字母进行排序的问题,但是我目前对此的测试用例还很少,所以我不确定它是否为100%.

It is currently sorting straight numbers, or numbers followed by letters correctly (i.e. 1, 2, 3, 4, 5, or 1, 2, 3a, 3b, 4, 5a, 5b). I'm not aware of any issues with it sorting straight letters incorrectly, but I also currently have very few test cases of this, so I'm not sure if it's 100% or not.

除了不知道如何在不弄乱其他排序的情况下修改我的SQL排序语句之外,我真的不知道它是如何按照上述BCP示例的顺序排列的.关于如何纠正它的任何想法?我曾考虑过尝试忽略卡号中的字母,直到我们到达数字为止,但这会导致卡号中只有字母的卡出现重大问题.所以我有点卡住了.

In addition to not knowing how to modify my SQL sort statement without messing up other sorts, I don't really know how it's coming up with the order it is for the BCP example above. Any thoughts on how to correct it? I thought about trying to ignore letters in card number until we get to numbers, but that would cause major issues for cards with only letters in the card number. So I'm a bit stuck.

绝对最糟糕的是,我可能会将卡号列分为2个不同的列,一列用于更具描述性的部分,一列用于我要排序的部分.这样可能最终可以正常工作,但是需要很多工作才能使事情恢复原状!

Absolute worst comes to worst, I could probably split the card number column into 2 different columns, one for the part that is more descriptive, and one for the part I want to sort by. That would probably end up working just fine, but would require a lot of work to get things back!

编辑-这是一些更多信息,包括我数据库中的数据(对不起格式,在这里不知道如何处理表格):

Edit- Here is some more information including data in my DB (sorry for the formatting, no idea how to do tables here):

| year | cardSet | subset                     | cardNum |
| 2016 | Bowman  |                            | 52      |
| 2016 | Bowman  |                            | 54      |
| 2016 | Bowman  |                            | 147     |
| 2016 | Bowman  | Chrome Prospects           | BCP32   |
| 2016 | Bowman  | Chrome Prospects           | BCP61   |
| 2016 | Bowman  | Chrome Prospects           | BCP97   |
| 2016 | Bowman  | Chrome Prospects           | BCP135  |
| 2016 | Topps   |                            | 1       |
| 2016 | Topps   |                            | 2a      |
| 2016 | Topps   |                            | 2b      |
| 2016 | Topps   |                            | 3       |

我希望我的排序可以按以下顺序吐出结果:

I would expect my sort to spit out results in the following order:

  • 2016 Bowman 52
  • 2016 Bowman 54
  • 2016 Bowman 147
  • 2016 Bowman Chrome前景广阔BCP32
  • 2016 Bowman Chrome前景广阔BCP61
  • 2016 Bowman Chrome前景广阔BCP97
  • 2016 Bowman Chrome前景广阔BCP125
  • 2016 Topps 1
  • 2016 Topps 2a
  • 2016 Topps 2b
  • 2016 Topps 3
  • 2016 Bowman 52
  • 2016 Bowman 54
  • 2016 Bowman 147
  • 2016 Bowman Chrome Prospects BCP32
  • 2016 Bowman Chrome Prospects BCP61
  • 2016 Bowman Chrome Prospects BCP97
  • 2016 Bowman Chrome Prospects BCP125
  • 2016 Topps 1
  • 2016 Topps 2a
  • 2016 Topps 2b
  • 2016 Topps 3

但是,这是我上面的排序语句给出的结果:

However, here is the results I am given with my above sorting statement:

  • 2016 Bowman 52
  • 2016 Bowman 54
  • 2016 Bowman 147
  • 2016 Bowman Chrome前景广阔BCP62
  • 2016 Bowman Chrome前景广阔BCP97
  • 2016 Bowman Chrome前景广阔BCP32
  • 2016 Bowman Chrome前景广阔BCP125
  • 2016 Topps 1
  • 2016 Topps 2a
  • 2016 Topps 2b
  • 2016 Topps 3
  • 2016 Bowman 52
  • 2016 Bowman 54
  • 2016 Bowman 147
  • 2016 Bowman Chrome Prospects BCP62
  • 2016 Bowman Chrome Prospects BCP97
  • 2016 Bowman Chrome Prospects BCP32
  • 2016 Bowman Chrome Prospects BCP125
  • 2016 Topps 1
  • 2016 Topps 2a
  • 2016 Topps 2b
  • 2016 Topps 3

它只处理带有数字的卡号,或者处理数字后跟字母就很好了,但是当卡号以字母开头和后跟数字时,它会弄乱事情.

It handles card numbers with just numbers, or numbers followed by letters just fine, but it tends to mess things up when the card number begins with letters and is followed by numbers.

我尝试在注释中使用length()技巧,以使SQL的排序部分为:

I have tried using the length() trick in the comments so that the sort part of my SQL is:

order by year desc, cardset asc, subset asc, length(cardNum), cardNum asc

那确实解决了我在上面描述的问题,但是弄乱了我的示例中的"Topps"部分-无论如何,它将带有字母的卡片放在数字的最后.这是我得到的排序顺序:

That does fix the issue I was describing above, but messes up the 'Topps' part in my example- It'll put cards with letters following a number last no matter what. Here's the order I get with that sort:

  • 2016 Bowman 52
  • 2016 Bowman 54
  • 2016 Bowman 147
  • 2016 Bowman Chrome前景广阔BCP32
  • 2016 Bowman Chrome前景广阔BCP61
  • 2016 Bowman Chrome前景广阔BCP97
  • 2016 Bowman Chrome前景广阔BCP125
  • 2016 Topps 1
  • 2016 Topps 3
  • 2016 Topps 2a
  • 2016 Topps 2b
  • 2016 Bowman 52
  • 2016 Bowman 54
  • 2016 Bowman 147
  • 2016 Bowman Chrome Prospects BCP32
  • 2016 Bowman Chrome Prospects BCP61
  • 2016 Bowman Chrome Prospects BCP97
  • 2016 Bowman Chrome Prospects BCP125
  • 2016 Topps 1
  • 2016 Topps 3
  • 2016 Topps 2a
  • 2016 Topps 2b

推荐答案

MariaDB 10 MySQL 8 支持REGEXP_REPLACE.使用它,您可以定义一个自定义函数:

MariaDB 10 and MySQL 8 support REGEXP_REPLACE. Using that, you can define a custom function:

DROP FUNCTION IF EXISTS zerofill_numbers;
CREATE FUNCTION zerofill_numbers(str TEXT, len TINYINT)
    RETURNS text
    NO SQL
    DETERMINISTIC
    RETURN REGEXP_REPLACE(
        REGEXP_REPLACE(str, '(\\d+)', LPAD('\\1', len+2, 0)),
        REPLACE('0*(\\d{$len})', '$len', len),
        '\\1'
    );

现在提供以下测试数据:

Now given the following test data:

DROP TABLE IF EXISTS `strings`;
CREATE TABLE IF NOT EXISTS `strings` (`str` text);
INSERT INTO `strings` (`str`) VALUES
    ('Bowman 52'),
    ('Bowman 54'),
    ('Bowman 147'),
    ('Bowman Chrome Prospects BCP32'),
    ('Bowman Chrome Prospects BCP61'),
    ('Bowman Chrome Prospects BCP97'),
    ('Bowman Chrome Prospects BCP125'),
    ('Topps 1'),
    ('Topps 3'),
    ('Topps 2a'),
    ('Topps 2b'),
    ('v9.9.3'),
    ('v9.10.3'),
    ('v11.3.4'),
    ('v9.9.11'),
    ('v11.3'),
    ('0.9'),
    ('0.11'),
    ('s09'),
    ('s11'),
    ('s0'),
    ('v9.0.1');

我们可以使用以下方式对其进行排序:

We can sort it with:

SELECT s.str
FROM strings s
ORDER BY zerofill_numbers(s.str, 10)

这是结果:

0.9
0.11
Bowman 52
Bowman 54
Bowman 147
Bowman Chrome Prospects BCP32
Bowman Chrome Prospects BCP61
Bowman Chrome Prospects BCP97
Bowman Chrome Prospects BCP125
s0
s09
s11
Topps 1
Topps 2a
Topps 2b
Topps 3
v9.0.1
v9.9.3
v9.9.11
v9.10.3
v11.3
v11.3.4

该函数将使用零填充字符串中的任何数字,直到其具有定义的长度为止.

The function will fill any number in in the string with zeros until it has the defined length.

注1:这将无法正确排序十进制数字(请参阅0.90.11).您也不应尝试将其用于签名号码.

Note 1: This will not sort decimal numbers correctly (see 0.9 and 0.11). You should also not try to use it for signed numbers.

注释2:此功能基于以下答案: https://stackoverflow.com/a/46099386/5563083 -因此,如果您认为此答案有帮助,请继续并注明出处.

Note 2: This function is based on the following answer: https://stackoverflow.com/a/46099386/5563083 - So if you find this answer helpfull, go and upvote the source.

注意3::如果您不想定义自定义函数,则可以内联使用相同的方法:

Note 3: If you don't want to define a custom function, you can use the same method inline:

SELECT *
FROM strings
ORDER BY
  REGEXP_REPLACE(REGEXP_REPLACE(str, '(\\d+)', LPAD('\\1', 10+2, 0)), '0*(\\d{10})', '\\1')

这篇关于当列可以以字母开头或结尾时,使用MySQL对varchar列进行数字排序并强制转换为无符号的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆