Should I use \d or [0-9] to match digits in a Perl regex?
Question
Having read a number of questions and answers over the past few weeks, I have seen the use of \d in Perl regular expressions commented on as incorrect. The reasoning is that in later versions of Perl \d is not the same as [0-9]: \d matches any Unicode character that has the digit property, whereas [0-9] matches only the characters '0', '1', '2', ..., '9'.
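The difference described above can be seen in a minimal sketch. The helper names `matches_backslash_d` and `matches_ascii_range` and the sample strings are hypothetical, chosen only for illustration; the Arabic-Indic digits are written as escapes so the source stays plain ASCII.

```perl
use strict;
use warnings;

# Hypothetical helpers: does the whole string consist of "digits"?
sub matches_backslash_d { return $_[0] =~ /\A\d+\z/    ? 1 : 0 }
sub matches_ascii_range { return $_[0] =~ /\A[0-9]+\z/ ? 1 : 0 }

my $ascii        = "123";
# ARABIC-INDIC DIGIT ONE, TWO, THREE (U+0661 to U+0663)
my $arabic_indic = "\x{0661}\x{0662}\x{0663}";

for my $sample ($ascii, $arabic_indic) {
    # Print the first code point rather than the string itself,
    # so no output encoding layer is needed.
    printf "starts with U+%04X: \\d=%d [0-9]=%d\n",
        ord($sample),
        matches_backslash_d($sample),
        matches_ascii_range($sample);
}
```

On a Unicode-aware Perl, the ASCII sample should match both patterns, while the Arabic-Indic sample should match only \d.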
I appreciate that in some contexts [0-9] will be the correct thing to use, and in others \d will be. I was wondering which one people feel is the correct default to use?
Personally I find the \d notation very succinct and expressive, whereas [0-9] is somewhat cumbersome by comparison. But I have little experience of writing multi-language code, or rather code for languages that do not fit into the ASCII character range, so I may be being naive.
I note that:
$ find /System/Library/Perl/5.8.8/ -name '*pm' | xargs grep '\d' | wc -l
298
$ find /System/Library/Perl/5.8.8/ -name '*pm' | xargs grep '[0-9]' | wc -l
26
Recommended answer
For maximum safety, I'd suggest using [0-9] any time you don't specifically intend to match all Unicode-defined digits.
Per perldoc perluniintro, Perl does not support using digits other than [0-9] as numbers, so I would definitely use [0-9] if the following are both true:
You want to use the result as a number, such as performing mathematical operations on it or storing it somewhere that only accepts proper numbers (e.g. an INT column in a database).
It is possible that non-digits ([^0-9]) are present in the data in such a way that the regular expression could match them. (Note that this one should always be considered true for untrusted or hostile input.)
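The two conditions above can be illustrated with a short sketch. The validator name `is_usable_integer` and the sample input are hypothetical; the point is only that a string can pass a \d check and still be unusable as a number in Perl.

```perl
use strict;
use warnings;

# Hypothetical validator: accept only strings made of ASCII digits,
# which Perl can actually treat as an integer.
sub is_usable_integer {
    my ($input) = @_;
    return $input =~ /\A[0-9]+\z/ ? 1 : 0;
}

# ARABIC-INDIC DIGIT FOUR, TWO: a human reads this as "42".
my $input = "\x{0664}\x{0662}";

print "passes the \\d check\n" if $input =~ /\A\d+\z/;

{
    # The string is not numeric to Perl; silence the resulting
    # "isn't numeric" warning for the purposes of this demo.
    no warnings 'numeric';
    my $sum = $input + 1;
    print "input + 1 = $sum\n";   # the string numifies to 0, so this is 1
}

print "rejected by the [0-9] check\n" unless is_usable_integer($input);
```

Validating with [0-9] before doing arithmetic or writing to an INT column avoids this silent conversion to 0.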
If either of these is false, there will only rarely be a reason to specifically avoid \d (and you'll probably be able to tell when that is the case), and if you're trying to match all Unicode-defined digits, you'll definitely want to use \d.