Using alternation or character class for single character matching?
Question
(Note: The title doesn't seem too clear. If someone can rephrase it, I'm all for it!)
Given this regex: (.*_e.txt), which matches some filenames, I need to add some other single-character suffixes in addition to the e. Should I choose a character class or should I use an alternation for this? (Or does it really matter?)
That is, which of the following two seems "better", and why:
a) (.*(e|f|x).txt), or
b) (.*[efx].txt)
Recommended answer
Use [efx]. That's exactly what character classes are designed for: to match one of the included characters. Therefore it's also the most readable and shortest solution.
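To make the comparison concrete, here is a minimal Python sketch showing that the two patterns from the question accept the same names. The file names are made up for illustration, and `fullmatch` is my choice of anchoring; the question does not say how the regex is applied.

```python
import re

# Hypothetical file names, invented for this example.
names = ["report_e.txt", "report_f.txt", "report_x.txt",
         "report_a.txt", "report_e.csv"]

alternation = re.compile(r"(.*(e|f|x).txt)")  # variant a)
char_class = re.compile(r"(.*[efx].txt)")     # variant b)

matched_alt = [n for n in names if alternation.fullmatch(n)]
matched_cls = [n for n in names if char_class.fullmatch(n)]

print(matched_alt == matched_cls)  # True
print(matched_cls)  # ['report_e.txt', 'report_f.txt', 'report_x.txt']

# The alternation needs parentheses, which adds a capturing group.
print(alternation.groups, char_class.groups)  # 2 1
```

One practical difference besides readability: the parentheses around the alternation create a second capturing group unless they are written as (?:e|f|x). Also note that the unescaped . in both patterns matches any character, so a stricter version would use \. before txt.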
I don't know if it's faster, but I would be very much surprised if it wasn't. It definitely won't be slower.
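Rather than guessing, the speed question can be checked by measurement. Below is a rough timing sketch using Python's `timeit`; the input string and the `time_pattern` helper are assumptions made for this example, and the numbers will vary by engine, version, and input. Some engines may rewrite a single-character alternation into a set internally, in which case the difference could be negligible.

```python
import re
import timeit

alternation = re.compile(r"(.*(e|f|x).txt)")
char_class = re.compile(r"(.*[efx].txt)")

# Hypothetical input: a long name so each match does a bit more work.
name = "a_fairly_long_file_name_" * 20 + "_x.txt"

def time_pattern(pattern, text, number=2000):
    """Return the total seconds taken by `number` fullmatch calls."""
    return timeit.timeit(lambda: pattern.fullmatch(text), number=number)

if __name__ == "__main__":
    print("alternation:", time_pattern(alternation, name))
    print("char class: ", time_pattern(char_class, name))
```

Run it several times and on inputs resembling the real filenames before drawing conclusions; a single run of a micro-benchmark like this is noisy.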
My reasoning (without ever having written a regex engine, so this is pure conjecture):
The regex token [abc] will be applied in a single step of the regex engine: "Is the next character one of a, b, or c?"
(a|b|c), however, tells the regex engine to

- remember the current position in the string for backtracking, if necessary
- check if it's possible to match a. If so, success. If not:
- check if it's possible to match b. If so, success. If not:
- check if it's possible to match c. If so, success. If not:
- give up.
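The two strategies described above can be sketched as a toy model in Python. This is only an illustration of the conjecture, not how any real regex engine is implemented; the function names `match_class` and `match_alternation` are hypothetical.

```python
def match_class(text, pos, chars):
    """Character class: one membership test on the next character."""
    if pos < len(text) and text[pos] in chars:
        return pos + 1  # position just after the matched character
    return None

def match_alternation(text, pos, alternatives):
    """Alternation: save the position, then try each branch in order."""
    saved = pos  # remembered so every branch starts from the same place
    for alt in alternatives:
        if text.startswith(alt, saved):
            return saved + len(alt)  # the first branch that matches wins
    return None  # every branch failed: give up

print(match_class("x.txt", 0, "efx"))                  # 1
print(match_alternation("x.txt", 0, ["e", "f", "x"]))  # 1
print(match_alternation("a.txt", 0, ["e", "f", "x"]))  # None
```

Both functions give the same answer for single characters; the alternation version simply does its work as a sequence of separate attempts, which is the extra bookkeeping the answer is pointing at.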