Should I use \d or [0-9] to match digits in a Perl regex?

Question

Having read a number of questions and answers over the past few weeks, I have seen the use of \d in Perl regular expressions commented on as incorrect. In later versions of Perl, \d is not the same as [0-9]: \d matches any Unicode character that has the digit property, whereas [0-9] matches only the characters '0', '1', '2', ..., '9'.
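To make the distinction concrete, here is a minimal sketch. It assumes Perl 5.14 or later (needed only for the /a modifier, which restricts \d to ASCII), and the variable names are illustrative. The test character is U+0663 ARABIC-INDIC DIGIT THREE, written as an escape so the source stays ASCII.

```perl
use strict;
use warnings;
use v5.14;   # assumption: the /a regex modifier needs Perl 5.14+

# U+0663 ARABIC-INDIC DIGIT THREE: a Unicode decimal digit outside ASCII.
my $arabic_three = "\x{0663}";

# \d follows Unicode rules for this string, so it matches.
my $matches_d = $arabic_three =~ /\A\d\z/ ? 1 : 0;

# The explicit range only covers the ten ASCII digits, so it does not match.
my $matches_range = $arabic_three =~ /\A[0-9]\z/ ? 1 : 0;

# With /a, \d is restricted to ASCII and behaves like [0-9].
my $matches_d_a = $arabic_three =~ /\A\d\z/a ? 1 : 0;

printf "\\d: %d, [0-9]: %d, \\d with /a: %d\n",
    $matches_d, $matches_range, $matches_d_a;
```

On such a Perl this should report a match for plain \d only, which is the behavior the comments on other answers were pointing at.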

I appreciate that in some contexts [0-9] will be the correct thing to use, and in others \d will be. Which do people feel is the correct default?

Personally I find the \d notation very succinct and expressive, whereas [0-9] is somewhat cumbersome in comparison. But I have little experience of writing multi-language code, or rather code for languages that do not fit into the ASCII character range, so I may be being naive.

I note that

$ find /System/Library/Perl/5.8.8/ -name \*pm | xargs grep '\\d' | wc -l
  298
$ find /System/Library/Perl/5.8.8/ -name \*pm | xargs grep '\[0-9\]' | wc -l
  26

Recommended answer

For maximum safety, I'd suggest using [0-9] any time you don't specifically intend to match all Unicode-defined digits.

Per perldoc perluniintro, Perl does not support using digits other than [0-9] as numbers, so I would definitely use [0-9] if the following are both true:

  1. You want to use the result as a number, such as performing mathematical operations on it or storing it somewhere that only accepts proper numbers (e.g. an INT column in a database).

  2. It is possible that non-digits [^0-9] would be present in the data in such a way that the regular expression could match them. (Note that this one should always be considered true for untrusted/hostile input.)

If either of these is false, there will only rarely be a reason to specifically avoid \d (and you'll probably be able to tell when that is the case). And if you're trying to match all Unicode-defined digits, you'll definitely want to use \d.
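As a rough illustration of the two conditions, the sketch below validates untrusted input with [0-9] before treating it as a number. The helper name parse_ascii_int is hypothetical, not something from the Perl core, and the numeric warning is silenced only to show what Perl does with non-ASCII digits.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the input as a number if it consists solely
# of ASCII digits, and undef otherwise.
sub parse_ascii_int {
    my ($input) = @_;
    return unless defined $input;
    return $input =~ /\A[0-9]+\z/ ? $input + 0 : undef;
}

# Arabic-Indic digits four and two: \d+ would accept this string,
# but Perl cannot use it as the number 42.
my $arabic_42 = "\x{0664}\x{0662}";

my $ok  = parse_ascii_int("42");
my $bad = parse_ascii_int($arabic_42);

# What happens if the string is used as a number anyway.
my $numified;
{
    no warnings 'numeric';     # the conversion would otherwise warn
    $numified = $arabic_42 + 0;
}

print defined $ok  ? "accepted: $ok\n" : "rejected\n";
print defined $bad ? "accepted\n"      : "rejected\n";
print "non-ASCII digits numify to: $numified\n";
```

The ASCII string is accepted as 42, the Arabic-Indic string is rejected, and forcing it into numeric context yields 0 rather than 42, which is why matching with \d is risky when the result will be used as a number.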
