How to find indexes of all non-matching characters with a JS regex?

Question

I have a string, and I want an array containing the indexes (positions) of the characters in it that do not match a certain regex.

The issue is that if I write it like this:

let match;
let reg = /[A-Za-z]|[0-9]/g;
let str = "1111-253-asdasdas";
let indexes = [];

do {
    match = reg.exec(str);
    if (match) indexes.push(match.index);
} while (match);

It works: it returns the indexes of all the characters that are numeric or alphabetic. But if I try to do the opposite with a negative lookahead in the regex, like this:

let match;
let reg = /(?!([A-Za-z]|[0-9]))/g;
let str = "1111-253-asdasdas";
let indexes = [];

do {
    match = reg.exec(str);
    if (match) indexes.push(match.index);
} while (match);

it ends up in an infinite loop.

What I'd like to achieve is the same kind of result as in the first case, but with the negated regex, so in this case the result would be:

indexes = [4, 8]; // which are the indexes in which a non-alphanumerical character appears

Is the loop wrong, or is the regex the one messing things up? Maybe exec does not work with negative lookahead expressions?

I would understand the regex not working as I intended (it may be badly written), but I don't understand the infinite loop, which makes me think that exec may not be the best way to achieve what I'm looking for.

Recommended answer

Root cause

The infinite loop is easy to explain: the regex has the g modifier, so it tries to match multiple occurrences of the pattern, starting each matching attempt at the end of the previous successful match, that is, at the lastIndex value:

See the exec documentation:


If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property.

However, your pattern matches an empty string, so a successful match leaves lastIndex equal to match.index. Since the loop does not check for that condition and advance lastIndex itself, the regex cannot move forward in the string.
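To make the stuck position visible, here is a minimal sketch that runs the question's regex for a fixed number of iterations instead of looping until exec returns null. The iteration cap and the seen array are illustrative additions, not part of the original code.

```javascript
// Illustration only: cap the iterations so the demo terminates.
const reg = /(?!([A-Za-z]|[0-9]))/g;
const str = "1111-253-asdasdas";
const seen = [];

for (let i = 0; i < 3; i++) {
    const match = reg.exec(str);
    // A zero-length match ends where it starts, so lastIndex equals match.index
    // and the next exec() call begins at the very same position.
    seen.push({ index: match.index, length: match[0].length, lastIndex: reg.lastIndex });
}
console.log(seen);
```

Every entry reports the same index and the same lastIndex with a match length of 0, which is why the original do...while loop never ends.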

Solution

Use a regex that matches any non-alphanumeric character, /[\W_]/g. Since it cannot match an empty string, the lastIndex property of the RegExp object changes with each match and no infinite loop occurs.

JS demo:

let match, indexes = [];
let reg = /[\W_]/g;
let str = "1111-253-asdasdas";

while (match = reg.exec(str)) {
    indexes.push(match.index);
}
console.log(indexes);
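As an alternative that avoids handling lastIndex in a loop altogether, the same indexes can be collected with String.prototype.matchAll. This is a sketch that assumes an ES2020-capable runtime; it is not part of the original answer.

```javascript
const str = "1111-253-asdasdas";

// matchAll requires the g flag and yields one match object per occurrence;
// the mapping function keeps only each match's starting index.
const indexes = Array.from(str.matchAll(/[\W_]/g), (m) => m.index);

console.log(indexes); // [4, 8]
```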

Also, see how to move the lastIndex property value manually.
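For completeness, a hedged sketch of what moving lastIndex by hand could look like with the question's negative lookahead. The end-of-string check is an addition made here: the lookahead also succeeds at the very end of the string, where there is no character to report.

```javascript
const reg = /(?!([A-Za-z]|[0-9]))/g;
const str = "1111-253-asdasdas";
const indexes = [];
let match;

while ((match = reg.exec(str)) !== null) {
    // A zero-length match leaves lastIndex at match.index; step past it by hand.
    if (match.index === reg.lastIndex) reg.lastIndex++;
    // Skip the empty match at the end of the string (index === str.length).
    if (match.index < str.length) indexes.push(match.index);
}
console.log(indexes); // [4, 8]
```

This keeps the original pattern, but the character-class regex from the solution above is simpler and needs neither of the two extra checks.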
