Find difference between two strings in JavaScript


Problem description

I need to find the difference between two strings.

const string1 = 'lebronjames';
const string2 = 'lebronnjames';

The expected output is to find the extra n and log it to the console.

Is there any way to do this in JavaScript?

Solution

Another option, for more sophisticated difference checking, is to make use of the PatienceDiff algorithm. I ported this algorithm to JavaScript at...

https://github.com/jonTrent/PatienceDiff

...Although the algorithm is typically used for line-by-line comparison of text (such as computer programs), it can also be used for character-by-character comparison. E.g., to compare two strings, you can do the following...

let a = "thelebronnjamist";
let b = "the lebron james";

let difference = patienceDiff( a.split(""), b.split("") );

...with difference.lines being set to an array with the results of the comparison...

difference.lines: Array(19)

0: {line: "t", aIndex: 0, bIndex: 0}
1: {line: "h", aIndex: 1, bIndex: 1}
2: {line: "e", aIndex: 2, bIndex: 2}
3: {line: " ", aIndex: -1, bIndex: 3}
4: {line: "l", aIndex: 3, bIndex: 4}
5: {line: "e", aIndex: 4, bIndex: 5}
6: {line: "b", aIndex: 5, bIndex: 6}
7: {line: "r", aIndex: 6, bIndex: 7}
8: {line: "o", aIndex: 7, bIndex: 8}
9: {line: "n", aIndex: 8, bIndex: 9}
10: {line: "n", aIndex: 9, bIndex: -1}
11: {line: " ", aIndex: -1, bIndex: 10}
12: {line: "j", aIndex: 10, bIndex: 11}
13: {line: "a", aIndex: 11, bIndex: 12}
14: {line: "m", aIndex: 12, bIndex: 13}
15: {line: "i", aIndex: 13, bIndex: -1}
16: {line: "e", aIndex: -1, bIndex: 14}
17: {line: "s", aIndex: 14, bIndex: 15}
18: {line: "t", aIndex: 15, bIndex: -1}

Any element where aIndex === -1 or bIndex === -1 indicates a difference between the two strings. Specifically...

  • Element 3 indicates that character " " appears only in b, at position 3.
  • Element 10 indicates that character "n" appears only in a, at position 9.
  • Element 11 indicates that character " " appears only in b, at position 10.
  • Element 15 indicates that character "i" appears only in a, at position 13.
  • Element 16 indicates that character "e" appears only in b, at position 14.
  • Element 18 indicates that character "t" appears only in a, at position 15.
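
The reading of the list above can be automated with a small filter. This is a minimal sketch: describeDifferences is a hypothetical helper, not part of the PatienceDiff repo, and it only assumes an array shaped like difference.lines. The sample records are copied from the result shown earlier.

```javascript
// Hypothetical helper (not part of the PatienceDiff repo).
// Takes an array shaped like difference.lines and describes each difference.
function describeDifferences(lines) {
  const notes = [];
  for (const { line, aIndex, bIndex } of lines) {
    if (aIndex === -1) {
      // Present in b but missing from a
      notes.push(`"${line}" only in b at position ${bIndex}`);
    } else if (bIndex === -1) {
      // Present in a but missing from b
      notes.push(`"${line}" only in a at position ${aIndex}`);
    }
  }
  return notes;
}

// A few records copied from the result above
const sampleLines = [
  { line: "e", aIndex: 2, bIndex: 2 },
  { line: " ", aIndex: -1, bIndex: 3 },
  { line: "n", aIndex: 8, bIndex: 9 },
  { line: "n", aIndex: 9, bIndex: -1 },
];

console.log(describeDifferences(sampleLines));
// [ '" " only in b at position 3', '"n" only in a at position 9' ]
```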

Note that the PatienceDiff algorithm is useful for comparing two similar blocks of text or strings. It will not tell you if basic edits, such as two words being swapped, have occurred. E.g., the following...

let a = "james lebron";
let b = "lebron james";

let difference = patienceDiff( a.split(""), b.split("") );

...returns difference.lines containing...

difference.lines: Array(18)

0: {line: "j", aIndex: 0, bIndex: -1}
1: {line: "a", aIndex: 1, bIndex: -1}
2: {line: "m", aIndex: 2, bIndex: -1}
3: {line: "e", aIndex: 3, bIndex: -1}
4: {line: "s", aIndex: 4, bIndex: -1}
5: {line: " ", aIndex: 5, bIndex: -1}
6: {line: "l", aIndex: 6, bIndex: 0}
7: {line: "e", aIndex: 7, bIndex: 1}
8: {line: "b", aIndex: 8, bIndex: 2}
9: {line: "r", aIndex: 9, bIndex: 3}
10: {line: "o", aIndex: 10, bIndex: 4}
11: {line: "n", aIndex: 11, bIndex: 5}
12: {line: " ", aIndex: -1, bIndex: 6}
13: {line: "j", aIndex: -1, bIndex: 7}
14: {line: "a", aIndex: -1, bIndex: 8}
15: {line: "m", aIndex: -1, bIndex: 9}
16: {line: "e", aIndex: -1, bIndex: 10}
17: {line: "s", aIndex: -1, bIndex: 11}

Notice that PatienceDiff does not report the swap of the first and last name. Instead, it shows which characters were removed from a and which characters were added to arrive at b.
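
To experiment with this record shape without pulling in the repo, it can be approximated with a textbook longest-common-subsequence table. This is only a sketch, and it is not the Patience Diff algorithm: lcsDiff is a hypothetical name, and where several alignments are equally good, its indices may differ from what patienceDiff reports. It is applied here to the two strings from the question.

```javascript
// Hypothetical stand-in for experimentation: a textbook LCS diff, NOT Patience Diff.
// Returns records shaped like the difference.lines entries shown above.
function lcsDiff(a, b) {
  const n = a.length;
  const m = b.length;
  // table[i][j] = length of the LCS of a[i..] and b[j..]
  const table = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      table[i][j] = a[i] === b[j]
        ? table[i + 1][j + 1] + 1
        : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }
  // Walk the table from the start, emitting matches, deletions, and insertions
  const lines = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (a[i] === b[j]) {
      lines.push({ line: a[i], aIndex: i, bIndex: j });
      i++;
      j++;
    } else if (table[i + 1][j] >= table[i][j + 1]) {
      lines.push({ line: a[i], aIndex: i, bIndex: -1 }); // only in a
      i++;
    } else {
      lines.push({ line: b[j], aIndex: -1, bIndex: j }); // only in b
      j++;
    }
  }
  for (; i < n; i++) lines.push({ line: a[i], aIndex: i, bIndex: -1 });
  for (; j < m; j++) lines.push({ line: b[j], aIndex: -1, bIndex: j });
  return lines;
}

const string1 = 'lebronjames';
const string2 = 'lebronnjames';

// Records with aIndex === -1 are characters that exist only in string2
const extra = lcsDiff(string1.split(""), string2.split(""))
  .filter(rec => rec.aIndex === -1);

console.log(extra.map(rec => rec.line).join("")); // logs "n"
```

The table costs O(n × m) time and memory, which is fine for short strings but is one reason dedicated diff algorithms exist for larger inputs.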

EDIT: Added new algorithm dubbed patienceDiffPlus.

After mulling over the last example above, which showed a limitation of PatienceDiff in identifying lines that likely moved, it dawned on me that there was an elegant way of using the PatienceDiff algorithm to determine whether any lines had likely moved, rather than just showing deletions and additions.

In short, I added the patienceDiffPlus algorithm to the bottom of the PatienceDiff.js file in the GitHub repo identified above. The patienceDiffPlus algorithm takes the deleted aLines[] and added bLines[] from the initial patienceDiff algorithm, and runs them through the patienceDiff algorithm again. I.e., patienceDiffPlus is seeking the Longest Common Subsequence of lines that likely moved, whereupon it records this in the original patienceDiff results. The patienceDiffPlus algorithm continues this until no more moved lines are found.
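
The idea of re-running the diff on the leftovers can be sketched as follows. This is an illustration under stated assumptions, not the code in PatienceDiff.js: lcsDiff is a hypothetical textbook LCS diff standing in for patienceDiff, and markMoved is a hypothetical name for the repeated matching pass.

```javascript
// Hypothetical stand-in for patienceDiff: a textbook LCS diff (NOT Patience Diff).
function lcsDiff(a, b) {
  const n = a.length;
  const m = b.length;
  // table[i][j] = length of the LCS of a[i..] and b[j..]
  const table = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      table[i][j] = a[i] === b[j]
        ? table[i + 1][j + 1] + 1
        : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }
  const lines = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (a[i] === b[j]) {
      lines.push({ line: a[i], aIndex: i, bIndex: j });
      i++;
      j++;
    } else if (table[i + 1][j] >= table[i][j + 1]) {
      lines.push({ line: a[i], aIndex: i, bIndex: -1 });
      i++;
    } else {
      lines.push({ line: b[j], aIndex: -1, bIndex: j });
      j++;
    }
  }
  for (; i < n; i++) lines.push({ line: a[i], aIndex: i, bIndex: -1 });
  for (; j < m; j++) lines.push({ line: b[j], aIndex: -1, bIndex: j });
  return lines;
}

// Hypothetical sketch of the move-detection pass described above.
// Repeatedly diffs the still-unmatched deletions against the still-unmatched
// insertions and records each match as a likely move.
function markMoved(lines) {
  for (;;) {
    const deleted = lines.filter(rec => rec.bIndex === -1 && !rec.moved);
    const inserted = lines.filter(rec => rec.aIndex === -1);
    if (deleted.length === 0 || inserted.length === 0) break;

    const matches = lcsDiff(deleted.map(rec => rec.line), inserted.map(rec => rec.line))
      .filter(rec => rec.aIndex !== -1 && rec.bIndex !== -1);
    if (matches.length === 0) break; // no more moved lines found

    for (const match of matches) {
      const from = deleted[match.aIndex];
      const to = inserted[match.bIndex];
      from.moved = true;
      to.moved = true;
      to.aIndex = from.aIndex; // remember where the moved item came from
    }
  }
  return lines;
}

const movedResult = markMoved(
  lcsDiff("james lebron".split(""), "lebron james".split(""))
);
console.log(movedResult[12]); // { line: ' ', aIndex: 5, bIndex: 6, moved: true }
```

Each round marks at least one pair or stops, so the loop always terminates.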

Now, using patienceDiffPlus, the following comparison...

let a = "james lebron";
let b = "lebron james";

let difference = patienceDiffPlus( a.split(""), b.split("") );

...returns difference.lines containing...

difference.lines: Array(18)

0: {line: "j", aIndex: 0, bIndex: -1, moved: true}
1: {line: "a", aIndex: 1, bIndex: -1, moved: true}
2: {line: "m", aIndex: 2, bIndex: -1, moved: true}
3: {line: "e", aIndex: 3, bIndex: -1, moved: true}
4: {line: "s", aIndex: 4, bIndex: -1, moved: true}
5: {line: " ", aIndex: 5, bIndex: -1, moved: true}
6: {line: "l", aIndex: 6, bIndex: 0}
7: {line: "e", aIndex: 7, bIndex: 1}
8: {line: "b", aIndex: 8, bIndex: 2}
9: {line: "r", aIndex: 9, bIndex: 3}
10: {line: "o", aIndex: 10, bIndex: 4}
11: {line: "n", aIndex: 11, bIndex: 5}
12: {line: " ", aIndex: 5, bIndex: 6, moved: true}
13: {line: "j", aIndex: 0, bIndex: 7, moved: true}
14: {line: "a", aIndex: 1, bIndex: 8, moved: true}
15: {line: "m", aIndex: 2, bIndex: 9, moved: true}
16: {line: "e", aIndex: 3, bIndex: 10, moved: true}
17: {line: "s", aIndex: 4, bIndex: 11, moved: true}

Notice the addition of the moved attribute, which identifies whether a line (or character in this case) was likely moved. Again, patienceDiffPlus simply matches the deleted aLines[] and added bLines[], so there is no guarantee that the lines were actually moved, but there is a strong likelihood that they were indeed moved.
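
To turn such a result into a list of likely moves, one option is to keep the records that carry moved and have both indices set. This is a minimal sketch: listMoves is a hypothetical helper, not part of the repo, and the sample records are copied from the output above.

```javascript
// Hypothetical helper (not part of the PatienceDiff repo).
// Keeps records flagged as moved that know both their source and destination.
function listMoves(lines) {
  return lines
    .filter(rec => rec.moved && rec.aIndex !== -1 && rec.bIndex !== -1)
    .map(rec => ({ line: rec.line, from: rec.aIndex, to: rec.bIndex }));
}

// A few records copied from the patienceDiffPlus result above
const plusSample = [
  { line: "j", aIndex: 0, bIndex: -1, moved: true },
  { line: "l", aIndex: 6, bIndex: 0 },
  { line: " ", aIndex: 5, bIndex: 6, moved: true },
  { line: "j", aIndex: 0, bIndex: 7, moved: true },
];

console.log(listMoves(plusSample));
// [ { line: ' ', from: 5, to: 6 }, { line: 'j', from: 0, to: 7 } ]
```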
