使用正则表达式在 JavaScript 中查找最长的重复子字符串 [英] Find longest repeating substring in JavaScript using regular expressions

查看:40
本文介绍了使用正则表达式在 JavaScript 中查找最长的重复子字符串的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在字符串中找到最长的重复字符串,用 JavaScript 实现并使用基于正则表达式的方法.

我有一个 PHP 实现,当直接移植到 JavaScript 时,它不起作用.

PHP 实现取自对问题查找最长的重复字符串?":

preg_match_all('/(?=((.+)(?:.*?\2)+))/s', $input, $matches, PREG_SET_ORDER);

这将填充 $matches[0][X](其中 X$matches[0] 的长度)在 $input 中找到的最长重复子串.我已经用许多输入字符串对此进行了测试,发现我确信输出是正确的.

JavaScript 中最近的直接端口是:

var 匹配 =/(?=((.+)(?:.*?\2)+))/.exec(input);

这不会给出正确的结果

<前>输入异常结果匹配[0][X]======================================================输入输入输入输入7inputinput 输入输入inputinput7 输入输入7输入输入7输入7XXinputinputYY 输入XX

我对正则表达式不够熟悉,无法理解这里使用的正则表达式在做什么.

我当然可以实现一些算法来找到最长的重复子串.在我尝试这样做之前,我希望不同的正则表达式会在 JavaScript 中产生正确的结果.

可以修改上面的正则表达式,以便在 JavaScript 中返回预期的输出吗?我承认这在单线中可能是不可能的.

解决方案

Javascript 匹配只返回第一个匹配 -- 你必须循环才能找到多个结果.一些测试表明这得到了预期的结果:

function maxRepeat(input) {var reg =/(?=((.+)(?:.*?\2)+))/g;var sub = "";//粘贴临时结果的地方var maxstr = "";//我们的最大长度重复字符串reg.lastIndex = 0;//因为 reg 以前存在,我们可能需要重置它sub = reg.exec(输入);//找到第一个重复的字符串while (!(sub == null)){if ((!(sub == null)) && (sub[2].length > maxstr.length)){maxstr = sub[2];}sub = reg.exec(输入);reg.lastIndex++;//从下一个位置开始搜索}返回最大字符串;}//为方便起见,我正在登录到控制台console.log(maxRepeat("aabcd"));//aaconsole.log(maxRepeat("inputinput"));//输入console.log(maxRepeat("7inputinput"));//输入console.log(maxRepeat("inputinput7"));//输入console.log(maxRepeat("7inputinput7"));//输入console.log(maxRepeat("xxabcdyy"));//Xconsole.log(maxRepeat("XXinputinputYY"));//输入

请注意,对于xxabcdyy",您只会返回x",因为它返回最大长度的第一个字符串.

I'd like to find the longest repeating string within a string, implemented in JavaScript and using a regular-expression based approach.

I have an PHP implementation that, when directly ported to JavaScript, doesn't work.

The PHP implementation is taken from an answer to the question "Find longest repeating strings?":

preg_match_all('/(?=((.+)(?:.*?\2)+))/s', $input, $matches, PREG_SET_ORDER);

This will populate $matches[0][X] (where X is the length of $matches[0]) with the longest repeating substring to be found in $input. I have tested this with many input strings and found am confident the output is correct.

The closest direct port in JavaScript is:

var matches = /(?=((.+)(?:.*?\2)+))/.exec(input);

This doesn't give correct results

input                  Excepted result   matches[0][X]
======================================================
inputinput             input             input
7inputinput            input             input
inputinput7            input             input
7inputinput7           input             7
XXinputinputYY         input             XX

I'm not familiar enough with regular expressions to understand what the regular expression used here is doing.

There are certainly algorithms I could implement to find the longest repeating substring. Before I attempt to do that, I'm hoping a different regular expression will produce the correct results in JavaScript.

Can the above regular expression be modified such that the expected output is returned in JavaScript? I accept that this may not be possible in a one-liner.

解决方案

Javascript matches only return the first match -- you have to loop in order to find multiple results. A little testing shows this gets the expected results:

function maxRepeat(input) {
 var reg = /(?=((.+)(?:.*?\2)+))/g;
 var sub = ""; //somewhere to stick temp results
 var maxstr = ""; // our maximum length repeated string
 reg.lastIndex = 0; // because reg previously existed, we may need to reset this
 sub = reg.exec(input); // find the first repeated string
 while (!(sub == null)){
  if ((!(sub == null)) && (sub[2].length > maxstr.length)){
   maxstr = sub[2];
  }
  sub = reg.exec(input);
  reg.lastIndex++; // start searching from the next position
 }
 return maxstr;
}

// I'm logging to console for convenience
console.log(maxRepeat("aabcd"));             //aa
console.log(maxRepeat("inputinput"));        //input
console.log(maxRepeat("7inputinput"));       //input
console.log(maxRepeat("inputinput7"));       //input
console.log(maxRepeat("7inputinput7"));      //input
console.log(maxRepeat("xxabcdyy"));          //x
console.log(maxRepeat("XXinputinputYY"));    //input

Note that for "xxabcdyy" you only get "x" back, as it returns the first string of maximum length.

这篇关于使用正则表达式在 JavaScript 中查找最长的重复子字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆