JavaScript Regular Expression String Match/Replace
Question
Given the string: "{abc}Lorem ipsum{/abc} {a}dolor{/a}"
I want to be able to find occurrences of curly brace "tags", store each tag and the index where it was found, and remove it from the original string. I want to repeat this process for each occurrence, but because I'm removing part of the string each time, every index must be correct for the shortened string, so I can't find all the indices and THEN remove the tags at the end. For the example above, what should happen is:
- Search the string...
- Find "{abc}" at index 0
- Push { tag: "{abc}", index: 0 } into an array
- Delete "{abc}" from string
- Repeat step 1 until no more matches can be found
Given this logic, "{/abc}" should be found at index 11 - since "{abc}" has already been removed.
I basically need to know where those "tags" start and end without actually having them as part of the string.
I'm almost there using regular expressions but it sometimes skips occurrences.
let BETWEEN_CURLYS = /{.*?}/g;
let text = '{abc}Lorem ipsum{/abc} {a}dolor{/a}';
let match = BETWEEN_CURLYS.exec(text);
let tags = [];
while (match !== null) {
tags.push(match);
text = text.replace(match[0], '');
match = BETWEEN_CURLYS.exec(text);
}
console.log(text); // should be; Lorem ipsum dolor
console.log(tags);
/**
* almost there...but misses '{a}'
* [ '{abc}', index: 0, input: '{abc}Lorem ipsum{/abc} {a}dolor{/a}' ]
* [ '{/abc}', index: 11, input: 'Lorem ipsum{/abc} {a}dolor{/a}' ]
* [ '{/a}', index: 20, input: 'Lorem ipsum {a}dolor{/a}' ]
*/
Recommended answer
You need to subtract the match length from the regex lastIndex value; otherwise the next iteration starts farther along than expected (the input becomes shorter, but lastIndex is not changed when you call replace to remove the {...} substring):
let BETWEEN_CURLYS = /{.*?}/g;
let text = '{abc}Lorem ipsum{/abc} {a}dolor{/a}';
let match = BETWEEN_CURLYS.exec(text);
let tags = [];
while (match !== null) {
tags.push(match);
text = text.replace(match[0], '');
BETWEEN_CURLYS.lastIndex -= match[0].length; // HERE: rewind by the length just removed from text
match = BETWEEN_CURLYS.exec(text);
}
console.log(text); // should be; Lorem ipsum dolor
console.log(tags);
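As an alternative to adjusting lastIndex by hand, the same bookkeeping can be done in a single pass with a replace callback. This is only a sketch, not part of the original answer: the helper name stripTags and the { tag, index } result shape are made up here for illustration (the shape follows the object described in the question). It relies on the callback receiving the match offset in the original string and subtracts the total length removed so far.

```javascript
// Hypothetical helper (not from the original answer): strips {...} tags in one
// pass and records where each tag would sit in the stripped string.
function stripTags(input) {
  const tags = [];
  let removed = 0; // total length of the tags removed so far

  const text = input.replace(/{.*?}/g, (tag, offset) => {
    // offset is the position in the ORIGINAL string; subtracting what has
    // already been removed gives the position in the stripped string.
    tags.push({ tag, index: offset - removed });
    removed += tag.length;
    return '';
  });

  return { text, tags };
}

const result = stripTags('{abc}Lorem ipsum{/abc} {a}dolor{/a}');
console.log(result.text); // Lorem ipsum dolor
console.log(result.tags); // indices 0, 11, 12, 17
```

Because the regex is never re-run against a mutated string, there is no lastIndex state to keep in sync.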
Some more RegExp#exec reference to bear in mind:
If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test() will also advance the lastIndex property).
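To make the lastIndex behaviour concrete, here is a small illustration on a shorter input than the one in the question (the string '{a}x{/a}' and the variable names are chosen here purely for demonstration). Each exec() call resumes from lastIndex, and a failed match resets it to 0.

```javascript
const re = /{.*?}/g;
const str = '{a}x{/a}';

// First call starts at lastIndex 0 and leaves lastIndex just past the match.
const first = re.exec(str);
console.log(first[0], first.index, re.lastIndex); // {a} 0 3

// Second call resumes from index 3.
const second = re.exec(str);
console.log(second[0], second.index, re.lastIndex); // {/a} 4 8

// No more matches: exec() returns null and lastIndex is reset to 0.
const third = re.exec(str);
console.log(third, re.lastIndex); // null 0
```

If the string is shortened between calls, lastIndex still points at the old position, which is why the loop in the question skips a tag.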