Regex for a valid 32-bit signed integer

Question

I'm pretty sure this hasn't actually been answered yet on this site. Once and for all, what is the shortest regex that matches a numeric string within the range of a 32-bit signed integer, i.e. -2147483648 to 2147483647?

I must use regex for validation - that is the only option available to me.

I have tried

\d{1,10}

but I can't figure out how to restrict it to the valid number range.


To aid in developing the regex, it should match:

-2147483648
-2099999999
-999999999
-1
0
1
999999999
2099999999
2147483647

It should not match:

-2147483649
-2200000000
-11111111111
2147483648
2200000000
11111111111

I have set up an on-line live demo (on rubular) that has my attempt and the test cases above.


Note: The shortest regex that works will be accepted. Efficiency of regex will not be considered (unless there's a tie for shortest length).

Solution

I really hope this is just a puzzle and no one will use regex for this problem in the real world. The proper solution would be to convert the number from a string to a numeric type such as BigInteger. That lets us check its range using proper methods or operators, like compareTo, >, <.
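A minimal sketch of that non-regex approach in Java, assuming the input arrives as a String. The class name Int32RangeCheck and the method isInt32 are made up for illustration. Note that the BigInteger(String) constructor also accepts a leading "+", which the regexes in this answer do not, and that a null argument is not handled here.

```java
import java.math.BigInteger;

class Int32RangeCheck {
    // Bounds of a 32-bit signed integer, taken from the Integer constants.
    private static final BigInteger MIN = BigInteger.valueOf(Integer.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Integer.MAX_VALUE);

    // Hypothetical helper: true if s parses as a decimal integer
    // in the range [-2147483648, 2147483647].
    static boolean isInt32(String s) {
        try {
            BigInteger value = new BigInteger(s);
            return value.compareTo(MIN) >= 0 && value.compareTo(MAX) <= 0;
        } catch (NumberFormatException e) {
            return false; // empty or not a decimal integer
        }
    }

    public static void main(String[] args) {
        System.out.println(isInt32("2147483647"));   // true
        System.out.println(isInt32("2147483648"));   // false
        System.out.println(isInt32("-2147483648"));  // true
        System.out.println(isInt32("-2147483649"));  // false
    }
}
```

BigInteger is used rather than Integer.parseInt so that an out-of-range value is reported by an explicit comparison instead of by a parse exception.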


To make life easier you can use this page (dead link) to generate a regex for a numeric range. So a regex for the range 0 - 2147483647 can look like

([0-9]{1,9}|1[0-9]{9}|2(0[0-9]{8}|1([0-3][0-9]{7}|4([0-6][0-9]{6}|7([0-3][0-9]{5}|4([0-7][0-9]{4}|8([0-2][0-9]{3}|3([0-5][0-9]{2}|6([0-3][0-9]|4[0-7])))))))))

(friendlier way)

(
  [0-9]{1,9}|
  1[0-9]{9}|
  2(0[0-9]{8}|
    1([0-3][0-9]{7}|
      4([0-6][0-9]{6}|
        7([0-3][0-9]{5}|
          4([0-7][0-9]{4}|
            8([0-2][0-9]{3}|
              3([0-5][0-9]{2}|
                6([0-3][0-9]|
                  4[0-7]
)))))))))

and for the range 0 - 2147483648

([0-9]{1,9}|1[0-9]{9}|2(0[0-9]{8}|1([0-3][0-9]{7}|4([0-6][0-9]{6}|7([0-3][0-9]{5}|4([0-7][0-9]{4}|8([0-2][0-9]{3}|3([0-5][0-9]{2}|6([0-3][0-9]|4[0-8])))))))))

So we can just combine these ranges and write it as

range of 0-2147483647 OR "-" range of 0-2147483648

which will give us

([0-9]{1,9}|1[0-9]{9}|2(0[0-9]{8}|1([0-3][0-9]{7}|4([0-6][0-9]{6}|7([0-3][0-9]{5}|4([0-7][0-9]{4}|8([0-2][0-9]{3}|3([0-5][0-9]{2}|6([0-3][0-9]|4[0-7])))))))))|-([0-9]{1,9}|1[0-9]{9}|2(0[0-9]{8}|1([0-3][0-9]{7}|4([0-6][0-9]{6}|7([0-3][0-9]{5}|4([0-7][0-9]{4}|8([0-2][0-9]{3}|3([0-5][0-9]{2}|6([0-3][0-9]|4[0-8])))))))))
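The combination can be sketched in Java as follows. The two range patterns differ only in their final character class, so a hypothetical helper range(char) builds both. The class and method names are assumptions made for this illustration, and Matcher#matches() is used so the whole string has to match. Like the patterns above, this sketch also accepts leading zeros and "-0".

```java
import java.util.regex.Pattern;

class CombinedInt32Regex {
    // Hypothetical helper: regex for 0 .. 214748364<last>, i.e. the range
    // pattern from above with a configurable final digit class.
    static String range(char last) {
        return "([0-9]{1,9}|1[0-9]{9}|2(0[0-9]{8}|1([0-3][0-9]{7}|4([0-6][0-9]{6}"
             + "|7([0-3][0-9]{5}|4([0-7][0-9]{4}|8([0-2][0-9]{3}|3([0-5][0-9]{2}"
             + "|6([0-3][0-9]|4[0-" + last + "])))))))))";
    }

    // "range of 0-2147483647" OR "-" followed by "range of 0-2147483648"
    static final Pattern INT32 = Pattern.compile(range('7') + "|-" + range('8'));

    static boolean isInt32(String s) {
        return INT32.matcher(s).matches(); // whole-string match
    }

    public static void main(String[] args) {
        String[] samples = {
            "-2147483648", "-1", "0", "2147483647",   // should match
            "-2147483649", "2147483648", "11111111111" // should not match
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isInt32(s));
        }
    }
}
```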

[edit]

As Bohemian noted in his comment, the final regex can take the form -?regex1|-2147483648, so here is a slightly shorter version (also with [0-9] changed to \d)

^-?(\d{1,9}|1\d{9}|2(0\d{8}|1([0-3]\d{7}|4([0-6]\d{6}|7([0-3]\d{5}|4([0-7]\d{4}|8([0-2]\d{3}|3([0-5]\d{2}|6([0-3]\d|4[0-7])))))))))$|^-2147483648$

If you use it with Java's String#matches(regex) method on each line, you can also drop the ^ and $ parts, since matches() only succeeds when the entire string matches the regex.
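A small sketch of that difference, assuming the regex is stored without its anchors. The names AnchorDemo, fullMatch and partialMatch are invented for this example. String#matches requires the whole input to match, while Matcher#find is satisfied by any matching substring, so the anchors are only redundant with the former.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // The shorter regex from above with ^ and $ removed.
    static final String UNANCHORED =
        "-?(\\d{1,9}|1\\d{9}|2(0\\d{8}|1([0-3]\\d{7}|4([0-6]\\d{6}|7([0-3]\\d{5}"
      + "|4([0-7]\\d{4}|8([0-2]\\d{3}|3([0-5]\\d{2}|6([0-3]\\d|4[0-7])))))))))"
      + "|-2147483648";

    // Whole-string match: behaves as if the regex were anchored.
    static boolean fullMatch(String s) {
        return s.matches(UNANCHORED);
    }

    // Substring search: without anchors this accepts too much.
    static boolean partialMatch(String s) {
        return Pattern.compile(UNANCHORED).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("2147483647"));    // true
        System.out.println(fullMatch("2147483648"));    // false
        System.out.println(partialMatch("2147483648")); // true: "214748364" matches
    }
}
```

With any API that searches for a substring, the anchored form of the regex is still needed.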

I know this regex is very ugly, but it just shows why regex is not a good tool for range validation.
