使用正则表达式在 JavaScript 中查找最长的重复子字符串 [英] Find longest repeating substring in JavaScript using regular expressions
问题描述
我想在字符串中找到最长的重复字符串,用 JavaScript 实现并使用基于正则表达式的方法.
我有一个 PHP 实现,当直接移植到 JavaScript 时,它不起作用.
PHP 实现取自对问题查找最长的重复字符串?":
preg_match_all('/(?=((.+)(?:.*?\2)+))/s', $input, $matches, PREG_SET_ORDER);
这将填充 $matches[0][X]
(其中 X
是 $matches[0]
的长度)在 $input
中找到的最长重复子串.我已经用许多输入字符串对此进行了测试,发现我确信输出是正确的.
JavaScript 中最近的直接端口是:
var 匹配 =/(?=((.+)(?:.*?\2)+))/.exec(input);
这不会给出正确的结果
<前>输入异常结果匹配[0][X]======================================================输入输入输入输入7inputinput 输入输入inputinput7 输入输入7输入输入7输入7XXinputinputYY 输入XX我对正则表达式不够熟悉,无法理解这里使用的正则表达式在做什么.
我当然可以实现一些算法来找到最长的重复子串.在我尝试这样做之前,我希望不同的正则表达式会在 JavaScript 中产生正确的结果.
可以修改上面的正则表达式,以便在 JavaScript 中返回预期的输出吗?我承认这在单线中可能是不可能的.
Javascript 匹配只返回第一个匹配 -- 你必须循环才能找到多个结果.一些测试表明这得到了预期的结果:
function maxRepeat(input) {var reg =/(?=((.+)(?:.*?\2)+))/g;var sub = "";//粘贴临时结果的地方var maxstr = "";//我们的最大长度重复字符串reg.lastIndex = 0;//因为 reg 以前存在,我们可能需要重置它sub = reg.exec(输入);//找到第一个重复的字符串while (!(sub == null)){if ((!(sub == null)) && (sub[2].length > maxstr.length)){maxstr = sub[2];}sub = reg.exec(输入);reg.lastIndex++;//从下一个位置开始搜索}返回最大字符串;}//为方便起见,我正在登录到控制台console.log(maxRepeat("aabcd"));//aaconsole.log(maxRepeat("inputinput"));//输入console.log(maxRepeat("7inputinput"));//输入console.log(maxRepeat("inputinput7"));//输入console.log(maxRepeat("7inputinput7"));//输入console.log(maxRepeat("xxabcdyy"));//Xconsole.log(maxRepeat("XXinputinputYY"));//输入
请注意,对于xxabcdyy",您只会返回x",因为它返回最大长度的第一个字符串.
I'd like to find the longest repeating string within a string, implemented in JavaScript and using a regular-expression based approach.
I have an PHP implementation that, when directly ported to JavaScript, doesn't work.
The PHP implementation is taken from an answer to the question "Find longest repeating strings?":
preg_match_all('/(?=((.+)(?:.*?\2)+))/s', $input, $matches, PREG_SET_ORDER);
This will populate $matches[0][X]
(where X
is the length of $matches[0]
) with the longest repeating substring to be found in $input
. I have tested this with many input strings and found am confident the output is correct.
The closest direct port in JavaScript is:
var matches = /(?=((.+)(?:.*?\2)+))/.exec(input);
This doesn't give correct results
input Excepted result matches[0][X] ====================================================== inputinput input input 7inputinput input input inputinput7 input input 7inputinput7 input 7 XXinputinputYY input XX
I'm not familiar enough with regular expressions to understand what the regular expression used here is doing.
There are certainly algorithms I could implement to find the longest repeating substring. Before I attempt to do that, I'm hoping a different regular expression will produce the correct results in JavaScript.
Can the above regular expression be modified such that the expected output is returned in JavaScript? I accept that this may not be possible in a one-liner.
Javascript matches only return the first match -- you have to loop in order to find multiple results. A little testing shows this gets the expected results:
function maxRepeat(input) {
var reg = /(?=((.+)(?:.*?\2)+))/g;
var sub = ""; //somewhere to stick temp results
var maxstr = ""; // our maximum length repeated string
reg.lastIndex = 0; // because reg previously existed, we may need to reset this
sub = reg.exec(input); // find the first repeated string
while (!(sub == null)){
if ((!(sub == null)) && (sub[2].length > maxstr.length)){
maxstr = sub[2];
}
sub = reg.exec(input);
reg.lastIndex++; // start searching from the next position
}
return maxstr;
}
// I'm logging to console for convenience
console.log(maxRepeat("aabcd")); //aa
console.log(maxRepeat("inputinput")); //input
console.log(maxRepeat("7inputinput")); //input
console.log(maxRepeat("inputinput7")); //input
console.log(maxRepeat("7inputinput7")); //input
console.log(maxRepeat("xxabcdyy")); //x
console.log(maxRepeat("XXinputinputYY")); //input
Note that for "xxabcdyy" you only get "x" back, as it returns the first string of maximum length.
这篇关于使用正则表达式在 JavaScript 中查找最长的重复子字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!