Using alternation or character class for single character matching?


Question

(Note: The title doesn't seem too clear; if someone can rephrase it, I'm all for it!)

Given this regex: (.*_e.txt), which matches some filenames, I need to add some other single-character suffixes in addition to the e. Should I choose a character class, or should I use an alternation for this? (Or does it even matter?)

That is, which of the following two seems "better", and why:

a) (.*(e|f|x).txt), or

b) (.*[efx].txt)
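As a quick sanity check, here is a minimal Python sketch that runs both candidate patterns over a few filenames. The filenames and the helper `accepted` are made up for illustration; the patterns are copied from the question as written, including the unescaped dot before txt (which matches any character, not only a literal dot).

```python
import re

# The two candidate patterns from the question, unchanged.
alternation = re.compile(r"(.*(e|f|x).txt)")
char_class = re.compile(r"(.*[efx].txt)")

# Hypothetical filenames, invented for this illustration.
names = ["report_e.txt", "report_f.txt", "report_x.txt",
         "report_a.txt", "notes.txt"]

def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [name for name in candidates if pattern.fullmatch(name)]

alt_hits = accepted(alternation, names)
cls_hits = accepted(char_class, names)

# Both patterns accept the same filenames from this sample.
print(alt_hits == cls_hits)

# The alternation adds a second capturing group; the class does not.
print(alternation.groups, char_class.groups)
```

On this sample the two patterns accept the same names. The one visible difference is that (e|f|x) introduces an extra capturing group, which shifts group numbering if the pattern is extended later.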

Recommended answer

Use [efx]. That's exactly what character classes are designed for: matching one of the included characters. It is therefore also the shortest and most readable solution.

I don't know if it's faster, but I would be very surprised if it weren't. It definitely won't be slower.
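If you want to measure this rather than guess, a rough timing sketch with Python's `timeit` might look like the following. The input string and repeat count are arbitrary assumptions, and the result depends on the engine: some engines may optimise a single-character alternation internally, so the gap can be small or absent.

```python
import re
import timeit

alternation = re.compile(r"(e|f|x)")
char_class = re.compile(r"[efx]")

# Hypothetical input: a long run of non-matching characters, so every
# position in the string has to be tested and rejected.
text = "a" * 10_000

def time_search(pattern, repeats=200):
    """Total seconds for `repeats` failed searches over `text`."""
    return timeit.timeit(lambda: pattern.search(text), number=repeats)

alt_seconds = time_search(alternation)
cls_seconds = time_search(char_class)

print(f"alternation: {alt_seconds:.4f}s")
print(f"char class:  {cls_seconds:.4f}s")
```

Treat the numbers as indicative only; they vary with the regex engine, its version, and the input.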

My reasoning (I have never written a regex engine, so this is pure conjecture):

The regex token [abc] will be applied in a single step of the regex engine: "Is the next character one of a, b, or c?"

(a|b|c), however, tells the regex engine to:

  • remember the current position in the string for backtracking, if necessary
  • check if it's possible to match a. If so, success. If not:
  • check if it's possible to match b. If so, success. If not:
  • check if it's possible to match c. If so, success. If not:
  • give up.
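The conjecture above can be sketched as a toy model in Python. This is not how any particular regex engine is implemented; `match_class` and `match_alternation` are hypothetical helpers that only mirror the steps just described, with an attempt counter added to make the difference visible.

```python
def match_class(text, pos, allowed):
    """Character class: one membership test on the next character."""
    return pos < len(text) and text[pos] in allowed

def match_alternation(text, pos, alternatives):
    """Alternation: try each branch in order from the same saved position."""
    saved = pos  # remembered so every branch restarts from the same place
    attempts = 0
    for branch in alternatives:
        attempts += 1
        if text.startswith(branch, saved):
            return True, attempts
    return False, attempts  # every branch failed: give up

print(match_class("c", 0, {"a", "b", "c"}))        # True
print(match_alternation("c", 0, ["a", "b", "c"]))  # (True, 3)
print(match_alternation("z", 0, ["a", "b", "c"]))  # (False, 3)
```

In this model the class answers with a single lookup, while the alternation may need one attempt per branch before it succeeds or gives up.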
