PHP regular expression to match alpha-numeric strings with some (but not all) punctuation
Question
I've written a regular expression in PHP to allow strings that are alphanumeric with any punctuation except & or @. Essentially, I need to allow anything on a standard American keyboard with the exception of those two characters. It took me a while to come up with the following regex, which seems to do what I need:
if (ereg("[^]A-Za-z0-9\[!\"#$%'()*+,./:;<=>?^_`{|}~\-]", $test_string)) {
// error message goes here
}
Which brings me to my question: is there a better, simpler, or more efficient way?
Recommended answer
Take a look at character ranges:
@[!-%'-?A-~]+@
This will exclude the characters & (0x26) and @ (0x40). Looking at an ASCII table, you can see how this works: the exclamation mark is the first printable character in the ASCII set that is not whitespace. The first range matches everything up to and including the % character, which immediately precedes the ampersand. The next range runs from ' up to ?, stopping just before the @ character, which lies between ? and A. After that, we match everything up to the end of the standard printable ASCII character set, which is ~.
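As a rough sketch, the range-based pattern could be used for validation in current PHP like this. The ereg() function from the question was removed in PHP 7, so the sketch uses preg_match() instead. The helper name isAllowedString is hypothetical, and the \A and \z anchors are an addition that is not part of the pattern above; without them, the pattern only requires that some run of allowed characters appears somewhere in the string. Note that the range starts at !, so a space is rejected as well.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
function isAllowedString(string $input): bool
{
    // Ranges: ! to % (0x21-0x25), ' to ? (0x27-0x3F), A to ~ (0x41-0x7E).
    // This skips & (0x26) and @ (0x40).
    // \A and \z (an assumption added here) force the whole string to match.
    return preg_match('/\A[!-%\'-?A-~]+\z/', $input) === 1;
}

var_dump(isAllowedString('Hello,world!'));     // bool(true)
var_dump(isAllowedString('a&b'));              // bool(false)
var_dump(isAllowedString('user@example.com')); // bool(false)
```

If spaces should be accepted, adding a literal space to the character class would be one option.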
To make things more readable, you might also consider doing this in two steps. First, filter out anything outside the default ASCII range:
@[!-~]+@
In a second step, filter out your undesired characters, or simply run strpos on the string for each of them.
At the end, you can compare the result with the string you started with to see whether it contained any undesired characters.
Alternatively, you could use a regex such as this for the second step:
/[^@&]+/
The steps are interchangeable, and running strpos on @ or & as the first step, to identify bad characters, may perform better.
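The two-step variant could look something like the following sketch. It first checks for the two forbidden characters with strpos() (the actual name of the PHP function referred to above), then confirms that every character falls in the printable ASCII range. The function name isAllowedTwoStep and the \A and \z anchors are assumptions made for illustration.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
function isAllowedTwoStep(string $input): bool
{
    // Step 1: plain substring search for the two forbidden characters.
    // strpos() returns false when the needle is absent.
    if (strpos($input, '@') !== false || strpos($input, '&') !== false) {
        return false;
    }

    // Step 2: every remaining character must be in ! to ~ (0x21-0x7E).
    // The anchors (an assumption added here) make the check cover the whole string.
    return preg_match('/\A[!-~]+\z/', $input) === 1;
}

var_dump(isAllowedTwoStep('abc123!?'));  // bool(true)
var_dump(isAllowedTwoStep('rock&roll')); // bool(false)
```

Whether this is actually faster than the single pattern would need measuring on real input; the main benefit here is that each step is easy to read on its own.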