Regex match overlap/crossover


Question

I need to capitalise acronyms in some text.

I currently use this regex to match the acronyms:

/(^|[^a-z0-9])(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)($|[^a-z0-9])/ig

Explanation: the pattern aims to match an acronym only when it sits at the start or end of the text, or has no letter or digit on either side. Otherwise it could be part of a longer word; for example, I don't want to replace the "Esc" in "Escape".

This works most of the time, but fails for the following input:

"abs/esc"

It matches abs but not esc. I'm guessing this is because the matches overlap: the forward slash is already part of the match for abs.

Can anyone suggest how to get a match on both?

As a side note, I'm using PHP's preg_replace_callback to perform the transformation:

$name = 'abs/esc';
$name = preg_replace_callback('/(^|[^a-z0-9])(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)($|[^a-z0-9])/i', function ($matches) {
    // Keep both delimiters unchanged and upper-case only the acronym.
    return $matches[1] . strtoupper($matches[2]) . $matches[3];
}, $name);

Recommended answer

Yes, the cause is the overlap. When the pattern matches abs, it also consumes the /. The search then resumes at e, so there is no character left for the leading [^a-z0-9] to match, and ^ does not apply because e is not at the start of the text.
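To make the overlap visible, here is a minimal sketch that counts the matches the original pattern finds in the question's input. The variable names are illustrative only.

```php
<?php
// Pattern from the question: group 3 consumes the character after the acronym.
$pattern = '/(^|[^a-z0-9])(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)($|[^a-z0-9])/i';

// preg_match_all() returns the number of full, non-overlapping matches.
$count = preg_match_all($pattern, 'abs/esc', $matches);

echo $count, "\n";                      // 1
echo implode(', ', $matches[0]), "\n";  // abs/
```

The single match is "abs/", slash included, which is why nothing is left in front of "esc" for the first group to match.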

You could use this regex instead:

\b(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)\b

\b is a word boundary. It is a zero-width assertion, so it consumes no characters and the matches cannot overlap.

Regex101 live demo
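As a rough sketch, the word-boundary pattern could be plugged into preg_replace_callback like this. The function name capitaliseWithWordBoundary is hypothetical, and the i flag is assumed to carry over from the question's pattern.

```php
<?php
// Hypothetical helper: upper-cases the listed acronyms using \b on both sides.
function capitaliseWithWordBoundary(string $text): string
{
    $pattern = '/\b(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)\b/i';

    // The whole match is just the acronym, so $m[0] can be upper-cased directly.
    return preg_replace_callback($pattern, function (array $m): string {
        return strtoupper($m[0]);
    }, $text);
}

echo capitaliseWithWordBoundary('abs/esc'), "\n";     // ABS/ESC
echo capitaliseWithWordBoundary('Escape key'), "\n";  // Escape key
```

One difference from the original character class is worth keeping in mind: \b treats the underscore as a word character, so "abs_esc" is left unchanged by this version, whereas [^a-z0-9] would accept the underscore as a delimiter.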

You can also change your regex to use a positive lookahead, which likewise consumes no characters:

(^|[^a-z0-9])(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)(?=$|[^a-z0-9])

Regex101 live demo
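The lookahead version might be used as sketched below. Because the trailing delimiter is no longer captured, the callback only needs groups 1 and 2. The second function is a variant that is not part of the answer: it replaces both delimiters with negative lookarounds, so nothing outside the acronym is consumed. Both function names are hypothetical.

```php
<?php
// Hypothetical helper using the answer's pattern: trailing delimiter as a lookahead.
function capitaliseWithLookahead(string $text): string
{
    $pattern = '/(^|[^a-z0-9])(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)(?=$|[^a-z0-9])/i';

    // Group 1 is still consumed, so it has to be put back in front of the acronym.
    return preg_replace_callback($pattern, function (array $m): string {
        return $m[1] . strtoupper($m[2]);
    }, $text);
}

// Variant (an assumption, not from the answer): negative lookbehind and lookahead.
function capitaliseWithLookarounds(string $text): string
{
    $pattern = '/(?<![a-z0-9])(ECU|HVAC|ABS|ESC|EGR|ADAS|HEV|HMI)(?![a-z0-9])/i';

    return preg_replace_callback($pattern, function (array $m): string {
        return strtoupper($m[0]);
    }, $text);
}

echo capitaliseWithLookahead('abs/esc'), "\n";    // ABS/ESC
echo capitaliseWithLookarounds('abs_esc'), "\n";  // ABS_ESC
echo capitaliseWithLookarounds('Escape'), "\n";   // Escape
```

Both keep the question's rule that only letters and digits count as part of a word, which is the main reason to prefer them over \b if underscores should act as delimiters.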
