Get the nearest match of a string in a list of strings


Problem description




Hi,

Let's assume string1 = harlishr and I have a list of strings: harish, reddy, kiran, aswanth, veeresh...
What I want is to compare string1 with all the strings in the list and return the nearest match for string1 (i.e. harish in this example).

I tried the LevenshteinDistance algorithm, but it works differently: we need to pass it two strings and it returns how many characters changed, whereas my case is different, since I have one string and a whole list to compare it against.

Can anyone explain how we can do that in C# 4.0?



Thanks,

Harish

Recommended answer



Hi,

See my Extended search method Tip/Trick.

Hope this will solve your problem

thanks
-Amit


Hello, I think the LevenshteinDistance algorithm could be the solution to your problem. Let me explain.

The default implementation of the algorithm:
static class LevenshteinDistance
{
    /// <summary>
    /// Compute the distance between two strings.
    /// </summary>
    public static int Compute(string s, string t)
    {
        int n = s.Length;
        int m = t.Length;
        int[,] d = new int[n + 1, m + 1];

        // Step 1
        if (n == 0)
        {
            return m;
        }

        if (m == 0)
        {
            return n;
        }

        // Step 2: initialize the first column and the first row
        for (int i = 0; i <= n; i++)
        {
            d[i, 0] = i;
        }

        for (int j = 0; j <= m; j++)
        {
            d[0, j] = j;
        }

        // Step 3
        for (int i = 1; i <= n; i++)
        {
            //Step 4
            for (int j = 1; j <= m; j++)
            {
                // Step 5
                int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;

                // Step 6
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        // Step 7
        return d[n, m];
    }
}




Next, a simple button click handler with the comparison logic:

// Requires: using System.Collections.Generic; using System.Linq;
private void button1_Click(object sender, EventArgs e)
{
    string baseString = "Basic";
    List<string> lstStringsToCheck = new List<string>
    {
        "Bas",
        "Test",
        "Prod",
        "Basist",
        "Bar",
        "Result",
        "Another string"
    };

    // Map each candidate string to its distance from baseString
    Dictionary<string, int> resultset = new Dictionary<string, int>();
    foreach (string stringtoTest in lstStringsToCheck)
    {
        resultset.Add(stringtoTest, LevenshteinDistance.Compute(baseString, stringtoTest));
    }

    // Get the minimum number of modifications needed to arrive at the baseString
    int minimumModifications = resultset.Min(c => c.Value);

    // Gives you all strings that need that minimum number of modifications
    // to become the same as the baseString
    var closestlist = resultset.Where(c => c.Value == minimumModifications);
}






As you can see, I start with a baseString; in your example this would be "harlishr". Next, I create a list with all the candidate strings.
Then I create a dictionary in which I save the results from the LevenshteinDistance.Compute method.
For each string in my list, I compute the distance and add it to the result dictionary.
Afterwards, I take the minimum value.
Finally, I select all the results that have this minimum value (this is the string, or list of strings, closest to the original).
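
If you only need a single closest string rather than the whole list of ties, the same idea can be sketched without the dictionary. This is only a minimal sketch: the names ClosestMatchSketch, Distance and FindClosest are made up for illustration, Distance is a two-row variant of the same recurrence as Compute, and ties are resolved by keeping the first candidate, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of any library.
static class ClosestMatchSketch
{
    // Levenshtein distance using the same recurrence as Compute,
    // but keeping only two rows of the table instead of the full matrix.
    public static int Distance(string s, string t)
    {
        if (s.Length == 0) return t.Length;
        if (t.Length == 0) return s.Length;

        int[] previous = new int[t.Length + 1];
        int[] current = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) previous[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(previous[j] + 1, current[j - 1] + 1),
                    previous[j - 1] + cost);
            }
            // The row just computed becomes the "previous" row for the next pass.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[t.Length];
    }

    // Returns the candidate with the smallest distance to input,
    // or null when there are no candidates. On a tie the first one wins.
    public static string FindClosest(string input, IEnumerable<string> candidates)
    {
        string best = null;
        int bestDistance = int.MaxValue;
        foreach (string candidate in candidates)
        {
            int distance = Distance(input, candidate);
            if (distance < bestDistance)
            {
                bestDistance = distance;
                best = candidate;
            }
        }
        return best;
    }

    static void Main()
    {
        var names = new List<string> { "harish", "reddy", "kiran", "aswanth", "veeresh" };
        Console.WriteLine(FindClosest("harlishr", names)); // prints "harish" (distance 2)
    }
}
```

Note that the comparison is case-sensitive, just like Compute above, so "Harish" and "harish" are one edit apart.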

hope this helps

Kind regards

