How to efficiently find similar strings in a unique string in JavaScript?

Views: 107 Posted: 2020/6/3 20:59:48 Tags: javascript algorithm


Problem description

Background: I have a list of 13,000 records of human names. Some of them are duplicates, and I want to find the similar ones so that I can deduplicate them manually.

For an array like this:

["jeff","Jeff","mandy","king","queen"]

what would be an efficient way to get:

[["jeff","Jeff"]]

Explanation: ["jeff","Jeff"] are grouped because their Levenshtein distance is 1 (the threshold can vary, for example 3).

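/*
Working but slow solution: every pair is compared, so the number of
distance computations grows quadratically with the list size.
isInLevenshteinRange(a, b, max) is a helper not shown here; it is assumed to
return true when the Levenshtein distance between a and b is at most max.
*/
function extractSimilarNames(uniqueNames) {
  const similarNamesGroup = [];
  const grouped = new Set(); // indexes already placed in a group

  for (let i = 0; i < uniqueNames.length; i++) {
    if (grouped.has(i)) continue;
    const currentName = uniqueNames[i];
    const suspiciousNames = [];

    // compare with the rest of the array
    for (let j = i + 1; j < uniqueNames.length; j++) {
      if (grouped.has(j)) continue;
      const matchingName = uniqueNames[j];
      if (isInLevenshteinRange(currentName, matchingName, 1)) {
        suspiciousNames.push(matchingName);
        grouped.add(j); // mark instead of removing, so indexes stay valid
      }
    }
    if (suspiciousNames.length > 0) {
      // current name first, e.g. ["jeff", "Jeff"]
      similarNamesGroup.push([currentName, ...suspiciousNames]);
    }
  }
  return similarNamesGroup;
}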
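
The code above calls isInLevenshteinRange, which the question does not show. The following is a minimal sketch of what such a helper might look like, assuming the signature (a, b, maxDistance) and that it should return a boolean. It is not the implementation the asker used. It rejects pairs early when the length difference alone already exceeds the threshold, and stops as soon as a whole row of the dynamic programming table is above the threshold.

```javascript
// Hypothetical helper: the question calls isInLevenshteinRange but does not show it.
// Returns true when the Levenshtein distance between a and b is <= maxDistance.
function isInLevenshteinRange(a, b, maxDistance) {
  // The distance is at least the difference in length, so reject early.
  if (Math.abs(a.length - b.length) > maxDistance) return false;

  // Classic two-row dynamic programming table.
  let previous = [];
  for (let j = 0; j <= b.length; j++) previous.push(j);

  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    let rowMin = current[0];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1,        // deletion
        current[j - 1] + 1,     // insertion
        previous[j - 1] + cost  // substitution
      );
      if (current[j] < rowMin) rowMin = current[j];
    }
    // Row minima never decrease, so the final distance cannot drop back below the limit.
    if (rowMin > maxDistance) return false;
    previous = current;
  }
  return previous[b.length] <= maxDistance;
}

console.log(isInLevenshteinRange("jeff", "Jeff", 1));  // true
console.log(isInLevenshteinRange("mandy", "king", 1)); // false
```

The comparison is case-sensitive and works on UTF-16 code units, which matches the example where "jeff" and "Jeff" are at distance 1.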

I want to find similarity via Levenshtein distance, not only lowercase/uppercase similarity.

I have already found one of the fastest Levenshtein implementations, but it still takes 35 minutes to get the result for a list of 13,000 items.
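
One possible way to cut down the number of comparisons, offered only as a sketch and not as a measured fix for the 35-minute run: the Levenshtein distance between two strings is never smaller than the difference in their lengths, so the names can be sorted by length and each name compared only against names whose length is within the threshold. The names groupSimilarNames, withinDistance, and maxDistance are illustrative and do not come from the question. Names of equal length are still all compared with each other, so the worst case is unchanged.

```javascript
// Illustrative bounded check (not the implementation used in the question).
function withinDistance(a, b, maxDistance) {
  if (Math.abs(a.length - b.length) > maxDistance) return false;
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost);
    }
    previous = current;
  }
  return previous[b.length] <= maxDistance;
}

// Groups names whose distance to the first member of the group is <= maxDistance.
function groupSimilarNames(uniqueNames, maxDistance) {
  // Sort a copy by length so that candidates inside the length window are contiguous.
  const sorted = [...uniqueNames].sort((x, y) => x.length - y.length);
  const grouped = new Array(sorted.length).fill(false);
  const groups = [];

  for (let i = 0; i < sorted.length; i++) {
    if (grouped[i]) continue;
    const group = [sorted[i]];
    for (let j = i + 1; j < sorted.length; j++) {
      // Later names are at least this long; once they are too long, stop scanning.
      if (sorted[j].length - sorted[i].length > maxDistance) break;
      if (!grouped[j] && withinDistance(sorted[i], sorted[j], maxDistance)) {
        group.push(sorted[j]);
        grouped[j] = true;
      }
    }
    if (group.length > 1) groups.push(group);
  }
  return groups;
}

console.log(groupSimilarNames(["jeff", "Jeff", "mandy", "king", "queen"], 1));
// [["jeff", "Jeff"]]
```

How much this helps depends on how spread out the name lengths are. With a threshold of 1 and names mostly between 4 and 10 characters, each name is still compared against a sizeable share of the list.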
