Finding difference between strings in JavaScript


Problem description

I'd like to compare two strings (a before and after) and detect exactly where and what changed between them.

For any change, I want to know:


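  1. The starting position of the change (zero-based).

  2. The ending position of the change (zero-based, relative to the previous text).

  3. What the change is.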

Assume that strings will change in only one place at a time (for example, never "Bill" -> "Kiln").

Additionally, I need the start and end positions to reflect the type of change:


  • If deletion, the start and end position should be the start and end positions of the deleted text, respectively
  • If replacement, the start and end position should be the start and end positions of the "deleted" text, respectively (the change will be the "added" text)
  • If insertion, the start and end positions should be the same; the entry point of the text
  • If no change, let start and end positions remain zero, with an empty change

For example:

"0123456789" -> "03456789"  
Start: 1, End: 2, Change: "" (deletion)

"03456789" -> "0123456789"  
Start: 1, End: 1, Change: "12" (insertion)

"Hello World!" -> "Hello Aliens!"  
Start: 6, End: 10, Change: "Aliens" (replacement)

"Hi" -> "Hi"  
Start: 0, End: 0, Change: "" (no change)

I was able to somewhat detect the positions of the changed text, but it doesn't work in all cases because in order to do that accurately, I need to know what kind of change is made.

var OldText = "My edited string!";
var NewText = "My first string!";

var ChangeStart = 0;
var NewChangeEnd = 0;
var OldChangeEnd = 0;
console.log("Comparing start:");
for (var i = 0; i < NewText.length; i++) {
    console.log(i + ": " + NewText[i] + " -> " + OldText[i]);
    if (NewText[i] != OldText[i]) {
        ChangeStart = i;
        break;
    }
}
console.log("Comparing end:");
// "Addition"?
if (NewText.length > OldText.length) {
    for (var i = 1; i < NewText.length; i++) {
        console.log(i + "(N: " + (NewText.length - i) + " O: " + (OldText.length - i) + ": " + NewText.substring(NewText.length - i, NewText.length - i + 1) + " -> " + OldText.substring(OldText.length - i, OldText.length - i + 1));
        if (NewText.substring(NewText.length - i, NewText.length - i + 1) != OldText.substring(OldText.length - i, OldText.length - i + 1)) {
            NewChangeEnd = NewText.length - i;
            OldChangeEnd = OldText.length - i;
            break;
        }
    }
// "Deletion"?
} else if (NewText.length < OldText.length) {
    for (var i = 1; i < OldText.length; i++) {
        console.log(i + "(N: " + (NewText.length - i) + " O: " + (OldText.length - i) + ": " + NewText.substring(NewText.length - i, NewText.length - i + 1) + " -> " + OldText.substring(OldText.length - i, OldText.length - i + 1));
        if (NewText.substring(NewText.length - i, NewText.length - i + 1) != OldText.substring(OldText.length - i, OldText.length - i + 1)) {
            NewChangeEnd = NewText.length - i;
            OldChangeEnd = OldText.length - i;
            break;
        }
    }
// Same length...
} else {
    // Do something
}
console.log("Change start: " + ChangeStart);
console.log("NChange end : " + NewChangeEnd);
console.log("OChange end : " + OldChangeEnd);
console.log("Change: " + OldText.substring(ChangeStart, OldChangeEnd + 1));

How can I tell whether an insertion, a deletion, or a replacement took place?

I've searched and found a few other similar questions, but they don't seem to help.

Recommended answer

I have gone through your code, and your logic for matching the strings makes sense to me. It logs ChangeStart, NewChangeEnd and OldChangeEnd correctly and the algorithm flows alright. You just want to know whether an insertion, deletion or replacement took place. Here's how I would go about it.

First of all, once you have found the first point of mismatch (ChangeStart), you need to make sure that when you then traverse the strings from the end, the index does not cross ChangeStart.

I'll give you an example. Consider the following strings:

 var NewText = "Hello Worllolds!";
 var OldText = "Hello Worlds!";

 ChangeStart -> 10 //Makes sense
 OldChangeEnd -> 8
 NewChangeEnd -> 11

 console.log("Change: " + NewText.substring(ChangeStart, NewChangeEnd + 1)); 
 //Outputs "lo"

The problem in this case appears when the code starts matching from the back. The flow is something like this:

 Comparing end: 
  1(N: 15 O: 12: ! -> !) 
  2(N: 14 O: 11: s -> s) 
  3(N: 13 O: 10: d -> d)  -> You need to stop here!

 //There is no mismatch yet, but we have reached ChangeStart and
 //we have already established that the characters from 0 -> ChangeStart-1 match.
 //That is why it outputs "lo" instead of "lol"

Assuming what I just said makes sense, you just need to modify your for loops like so:

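 if (NewText.length > OldText.length) {
     for (var i = 1; i < NewText.length && (OldText.length - i) >= ChangeStart; i++) {
         // ... (logging as before)
         // Assume a match first: the change ends one position before this pair.
         NewChangeEnd = NewText.length - i - 1;
         OldChangeEnd = OldText.length - i - 1;
         if (NewText[NewText.length - i] != OldText[OldText.length - i]) {
             // Mismatch: store the values as in your original code, then stop.
             NewChangeEnd = NewText.length - i;
             OldChangeEnd = OldText.length - i;
             break;
         }
     }
 }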

The condition (OldText.length-i)>=ChangeStart takes care of the anomaly that I mentioned: the for loop terminates automatically as soon as this condition no longer holds. However, as I just demonstrated, that can happen before any mismatch is encountered. So on every iteration you need to set NewChangeEnd and OldChangeEnd to 1 less than the index that was just matched. In case of a mismatch, you store the values as before.

Instead of an else-if, we could use a plain else to cover every case where NewText.length > OldText.length is definitely not true, i.e. the change is either a replacement or a deletion. Likewise, NewText.length > OldText.length means the change could be a replacement or an insertion, as in your examples. So the else could be something like:

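 else {
     for (var i = 1; i < OldText.length && (OldText.length - i) >= ChangeStart; i++) {
         // ... (logging as before)
         // Assume a match first: the change ends one position before this pair.
         NewChangeEnd = NewText.length - i - 1;
         OldChangeEnd = OldText.length - i - 1;
         if (NewText[NewText.length - i] != OldText[OldText.length - i]) {
             // Mismatch: store the values as in your original code, then stop.
             NewChangeEnd = NewText.length - i;
             OldChangeEnd = OldText.length - i;
             break;
         }
     }
 }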
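
For reference, the two bounded scans can be folded into one small function. This is only a sketch and not the code from the jsfiddle: the name findChangeBounds is made up for illustration, the forward scan is capped at the shorter string's length, and the backward scan stops when either string's index would cross the start of the change (the loops above only check the OldText index). The end indices are inclusive, as everywhere else in this answer.

```javascript
// Hypothetical helper (not from the original answer): returns the three
// indices discussed above. Both end indices are inclusive.
function findChangeBounds(oldText, newText) {
    var shorter = Math.min(oldText.length, newText.length);

    // Forward scan: first index where the two strings differ.
    var changeStart = 0;
    while (changeStart < shorter && oldText[changeStart] === newText[changeStart]) {
        changeStart++;
    }

    // Backward scan: stop at a mismatch, or as soon as either index
    // would cross changeStart.
    var oldChangeEnd = oldText.length - 1;
    var newChangeEnd = newText.length - 1;
    while (oldChangeEnd >= changeStart && newChangeEnd >= changeStart &&
           oldText[oldChangeEnd] === newText[newChangeEnd]) {
        oldChangeEnd--;
        newChangeEnd--;
    }

    return {
        changeStart: changeStart,
        oldChangeEnd: oldChangeEnd,
        newChangeEnd: newChangeEnd
    };
}

var bounds = findChangeBounds("Hello Worlds!", "Hello Worllolds!");
console.log(bounds);
// Prints "lol"
console.log("Hello Worllolds!".substring(bounds.changeStart, bounds.newChangeEnd + 1));
```

For two identical strings, both end indices come out below changeStart, so that case is best checked separately before looking at the indices.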

If you have understood the minor changes thus far, identifying the specific cases is really simple:


  1. Deletion - Condition -> ChangeStart > NewChangeEnd. Deleted string from ChangeStart -> OldChangeEnd.

     Deleted text -> OldText.substring(ChangeStart, OldChangeEnd + 1);

  2. Insertion - Condition -> ChangeStart > OldChangeEnd. Inserted string at ChangeStart.

     Inserted text -> NewText.substring(ChangeStart, NewChangeEnd + 1);

  3. Replacement - If NewText != OldText and the above two conditions are not met, then it is a replacement.

     Text in old string that got replaced -> OldText.substring(ChangeStart, OldChangeEnd + 1);

     The replacement text -> NewText.substring(ChangeStart, NewChangeEnd + 1);

     Start and end positions in the OldText that got replaced -> ChangeStart -> OldChangeEnd
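
To map the three cases onto the Start / End / Change triple asked for in the question, something along these lines might work. It is a sketch under my own assumptions: describeChange and the type field are hypothetical names, and it measures the common prefix and suffix lengths directly instead of reusing the loops above.

```javascript
// Hypothetical helper: reports a single change in the question's format.
function describeChange(oldText, newText) {
    if (oldText === newText) {
        return { type: "none", start: 0, end: 0, change: "" };
    }
    var shorter = Math.min(oldText.length, newText.length);

    // Length of the common prefix.
    var prefix = 0;
    while (prefix < shorter && oldText[prefix] === newText[prefix]) {
        prefix++;
    }

    // Length of the common suffix, never overlapping the prefix.
    var suffix = 0;
    while (suffix < shorter - prefix &&
           oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]) {
        suffix++;
    }

    var removed = oldText.substring(prefix, oldText.length - suffix);
    var added = newText.substring(prefix, newText.length - suffix);

    if (removed.length === 0) {
        // Insertion: start and end are both the entry point.
        return { type: "insertion", start: prefix, end: prefix, change: added };
    }
    // Deletion or replacement: end is the last removed index in the old text.
    return {
        type: added.length === 0 ? "deletion" : "replacement",
        start: prefix,
        end: oldText.length - suffix - 1,
        change: added
    };
}

// { type: "replacement", start: 6, end: 10, change: "Aliens" }
console.log(describeChange("Hello World!", "Hello Aliens!"));
```

When the changed text repeats neighbouring characters (as in "Hello Worlds!" -> "Hello Worllolds!"), more than one position is a valid answer. This sketch always reports the one that follows the longest common prefix.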

I have created a jsfiddle incorporating the changes I have mentioned into your code. You might want to check it out. Hope it gets you started in the right direction.
