Should I use \d or [0-9] to match digits in a Perl regex?

Question

Having read a number of questions and answers over the past few weeks, I have seen the use of \d in Perl regular expressions commented on as incorrect. The reasoning is that in later versions of Perl \d is not the same as [0-9]: \d matches any Unicode character that has the digit property, whereas [0-9] matches only the characters '0', '1', '2', ..., '9'.
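
The difference can be seen with a short Perl sketch. The two helper subs are hypothetical names introduced only for this illustration, and the non-ASCII test character (ARABIC-INDIC DIGIT THREE, U+0663) is just one example of a Unicode digit.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; they are not part of any module.
sub matches_backslash_d { return $_[0] =~ /\A\d\z/    ? 1 : 0 }
sub matches_ascii_range { return $_[0] =~ /\A[0-9]\z/ ? 1 : 0 }

# "3" is ASCII; "\x{0663}" is ARABIC-INDIC DIGIT THREE, written as an
# escape so the source file itself stays ASCII.
for my $char ("3", "\x{0663}") {
    printf "U+%04X  \\d=%d  [0-9]=%d\n",
        ord($char), matches_backslash_d($char), matches_ascii_range($char);
}
```

Both patterns should match the ASCII "3", while only \d should match U+0663.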

I appreciate that in some contexts [0-9] will be the correct thing to use, and in others \d will be. Which do people feel is the correct default?

Personally I find the \d notation very succinct and expressive, whereas [0-9] is somewhat cumbersome by comparison. But I have little experience of writing multi-language code, or rather code for languages that do not fit into the ASCII character range, so I may be being naive.

I notice that:

$ find /System/Library/Perl/5.8.8/ -name '*pm' | xargs grep '\\d' | wc -l
  298
$ find /System/Library/Perl/5.8.8/ -name '*pm' | xargs grep '\[0-9\]' | wc -l
  26

Recommended answer

For maximum safety, I'd suggest using [0-9] any time you don't specifically intend to match all Unicode-defined digits.

Per perldoc perluniintro, Perl does not support using digits other than [0-9] as numbers, so I would definitely use [0-9] if both of the following are true:

  1. You want to use the result as a number, such as performing mathematical operations on it or storing it somewhere that only accepts proper numbers (e.g. an INT column in a database).

  2. It is possible that non-digits [^0-9] would be present in the data in such a way that the regular expression could match them. (Note that this should always be considered true for untrusted or hostile input.)

If either of these is false, there will only rarely be a reason to specifically avoid \d (and you'll probably be able to tell when that is the case), and if you're trying to match all Unicode-defined digits, you'll definitely want to use \d.
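
The point about using the result as a number can be sketched as follows. The helper parse_count is a hypothetical name made up for this illustration, and the sample input (two Arabic-Indic digits) is an assumption chosen to show the failure mode, not something taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only ASCII digits before treating the
# input as a number; return undef for anything else.
sub parse_count {
    my ($input) = @_;
    return $input =~ /\A[0-9]+\z/ ? $input + 0 : undef;
}

# ARABIC-INDIC DIGITS FOUR and TWO, written as escapes.
my $arabic = "\x{0664}\x{0662}";

# \d+ accepts the string even though Perl cannot use it as a number.
print "matched by \\d+\n" if $arabic =~ /\A\d+\z/;

{
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    print "numeric value: ", $arabic + 0, "\n";
}

print defined(parse_count($arabic)) ? "accepted\n" : "rejected\n";
print parse_count("42") + 1, "\n";
```

Here the \d+ check passes, but the string numifies to 0 rather than 42, which is the kind of silent error that validating with [0-9] avoids.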
