IsStringMadeOf方法的最快技术是什么? [英] What is the fastest technique for IsStringMadeOf method?

查看:146
本文介绍了IsStringMadeOf方法的最快技术是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个被称为千万次的方法。由于其复杂性,我的代码运行缓慢。有没有更好的技术?允许使用不安全的方法。



谢谢。





简单检查输入字符串的字母是否可以由其他字符串的字母组成。



我当前的方法;



  public   static   bool  IsStringMadeOf( this   string  str,字符串 来自
{
for int i = 0 ; i < str.Length; i ++)
{
int index = 来自 .IndexOf(STR [1]);
if (index!= -1)
{
来自 = 来自 .Remove(index, 1 );
}
其他
{
返回 ;
}
}
返回 true ;
}

Console.WriteLine(IsStringMadeOf( SLOW WSOL));
// True

Console.WriteLine(IsStringMadeOf(< span class =code-string> SLOW WTOL));
// False

Console.WriteLine(IsStringMadeOf(< span class =code-string>
ASIA XZYABSTRIB));
// False

Console.WriteLine(IsStringMadeOf(< span class =code-string>
ASIA XZYABSTRIAB));
// True







编辑



基准测试结果:

http://pastebin.com/uJcYNAe1 [ ^ ]



我对结果不满意但是我会选择不安全的方法,最好的它的效果最好,但一般情况下它与我的相同。



EDIT2



我尝试从非托管库中调用方法,但没有得到更好的结果。



http://pastebin.com/uzkXhFy4 [ ^ ]

解决方案

你想要多疯狂?

如果你想要一些不安全的代码你可以快〜70%通过使用指针;

public unsafe static bool IsStringMadeOf3( this string str , string 来自){
char [] copy = new char [来自。长度]。
来自 .CopyTo( 0 ,copy, 0 ,copy.Length);
fixed (char * s = str){
fixed (char * f = copy){
var l = str.Length;
for int i = 0 ; i < l; ++ i){
var index = .IndexOf(S [1]);
if (index!= -1)
f [index] = ' < span class =code-string> \0'
;
else
return false < /跨度>;
}
返回 true ;
}
}
}



它本质上是你的算法,但它从 c>(或者更确切地说是它的副本)而不是在它上面做 String.Remove



我可能会远离这个解决方案,指针永远不会很好玩,尤其不是在C#中(虽然它们很棒)。



希望这会有所帮助,

Fredrik


使用Linq(需要.NET 3.5或更高版本)来计算这种方法会很有趣:

  private   bool  IsCharMatch( string  stringToMatch, string  stringToTest)
{
return
// 实现长度相等的测试?
// stringToMatch.Length == stringToTest.Length
// &&
stringToMatch.OrderBy(s = > s).SequenceEqual(stringToTest.OrderBy(s = > s));
}

编辑:



这是一个实验性方法,使用Linq'GroupBy和'ToDictionary方法,解决这些问题(希望如此) )我们从OP通过对话了解到所需结果:

  private   bool  IsCharMatch2( string  stringToMatch, string  stringToTest)
{
if (stringToMatch.Length > stringToTest.Length) return false ;

// 构造字典,其中每个键都是一个字符
// 每个值都是字符的出现次数
//

字典< char,int> seqSource = stringToMatch.GroupBy(c = > c)
.ToDictionary(key = > key.Key, value = > value .Count之间());

字典< char,int> seqTest = stringToTest.GroupBy(c = > c)
.ToDictionary(key = > key.Key, value = > value .Count之间());

int count;

foreach var kvp in seqSource)
{
if (!seqTest.TryGetValue(kvp.Key, out count)|| kvp.Value > count) return ;
}

返回 true ;
}

以下是一些简单测试的结果:

  //   true  
bool match1 = IsCharMatch2( hello olelh);
// true
bool match2 = IsCharMatch2( A AA);
// false
bool match3 = IsCharMatch2( AA A);
// false
bool match4 = IsStringMadeOf( eroerfwyhpw whyfore);
// false
bool match5 = IsStringMadeOf( eroerfwyhw whyfore);
// true
bool match6 = IsStringMadeOf( eroerfwyhw wheyrxf123oryeeewz);


声音,就像你在寻找'相似度算法',它提供了一种比较字符串和返回结果作为相似性百分比的方法;)



BTW:几年前我问过类似问题 [ ^ ]。



请参阅:

http://stackoverflow.com/questions/653157/a-better-similarity-ranking-algorithm-for-variable-length-strings [<一个href =http://stackoverflow.com/questions/653157/a-better-similarity-ranking-algorithm-for-variable-length-stringstarget =_ blanktitle =New Window> ^ ]

http://stackoverflow.com/questions/3576211/string-similarity-algorithms [ ^ ]

http://en.wikipedia.org/wiki/String_metric [ ^ ]


I have a method which is called thousands time. Because of its complexity, my code is working slow. Is there any better technique? Unsafe methods are allowed.

Thank you.


It simply checks letters of input string whether it could be made of the letters of other string.

My current method;

public static bool IsStringMadeOf(this string str, string from)
{
    for (int i = 0; i < str.Length; i++)
    {
        int index = from.IndexOf(str[i]);
        if (index != -1)
        {
            from = from.Remove(index, 1);
        }
        else
        {
            return false;
        }
    }
    return true;
}

Console.WriteLine(IsStringMadeOf("SLOW","WSOL"));
//True

Console.WriteLine(IsStringMadeOf("SLOW","WTOL"));
//False

Console.WriteLine(IsStringMadeOf("ASIA","XZYABSTRIB"));
//False

Console.WriteLine(IsStringMadeOf("ASIA","XZYABSTRIAB"));
//True




EDIT

Benchmark Results:
http://pastebin.com/uJcYNAe1[^]

I am not satisfied with results but I will go for unsafe method, in best case it works best but in average case it is same as mine.

EDIT2

I tried calling method from unmanaged library but didn't get a better result.

http://pastebin.com/uzkXhFy4[^]

解决方案

How crazy do you want to go?
If you're up for some un-safe code you can go ~70% faster by using pointers;

public unsafe static bool IsStringMadeOf3(this string str, string from) {
    char[] copy = new char[from.Length];
    from.CopyTo(0, copy, 0, copy.Length);
    fixed (char* s = str) {
        fixed (char* f = copy) {
            var l = str.Length;
            for (int i = 0; i < l; ++i) {
                var index = from.IndexOf(s[i]);
                if (index != -1)
                    f[index] = '\0';
                else
                    return false;
            }
            return true;
        }
    }
}


It's essentially your algorithm but it modifies the from (or rather a copy of it) in place instead of doing a String.Remove on it.

I'd probably stay away from this solution, pointers are never nice to play with, especially not in C# (although they are awesome).

Hope this helps,
Fredrik


It would be interesting to time this approach using Linq (requires .NET 3.5 or later):

private bool IsCharMatch(string stringToMatch, string stringToTest)
{
    return 
        // implement test for length equality ?
        //stringToMatch.Length == stringToTest.Length
        //&&
        stringToMatch.OrderBy(s => s).SequenceEqual(stringToTest.OrderBy(s => s));
}

Edit:

Here's an "experimental" approach, using Linq 'GroupBy and 'ToDictionary methods, that addresses (hopefully) what we have learned from the OP via dialogue about the results desired:

private bool IsCharMatch2(string stringToMatch, string stringToTest)
{
    if (stringToMatch.Length > stringToTest.Length) return false;

    // construct Dictionaries where each Key is a character
    // and each Value is the number of occurrences of the character
    // in the string

    Dictionary<char, int> seqSource = stringToMatch.GroupBy(c => c)
        .ToDictionary(key => key.Key, value => value.Count());

    Dictionary<char, int> seqTest = stringToTest.GroupBy(c => c)
        .ToDictionary(key => key.Key, value => value.Count());

    int count;

    foreach(var kvp in seqSource)
    {
        if (! seqTest.TryGetValue(kvp.Key, out count) || kvp.Value > count) return false;
    }

    return true;
}

Here's the result of some simple tests:

// true
bool match1 = IsCharMatch2("hello", "olelh");
// true
bool match2 = IsCharMatch2("A", "AA");
// false
bool match3 = IsCharMatch2("AA", "A");
// false
bool match4 = IsStringMadeOf("eroerfwyhpw", "whyfore");
// false
bool match5 = IsStringMadeOf("eroerfwyhw", "whyfore");
// true
bool match6 = IsStringMadeOf("eroerfwyhw", "wheyrxf123oryeeewz");


Sounds, like you're looking for 'similarity algorithm', which provides a way to compare strings and return result as a percentage of similarity ;)

BTW: Few years ago i asked similar question[^].

Please, see:
http://stackoverflow.com/questions/653157/a-better-similarity-ranking-algorithm-for-variable-length-strings[^]
http://stackoverflow.com/questions/3576211/string-similarity-algorithms[^]
http://en.wikipedia.org/wiki/String_metric[^]


这篇关于IsStringMadeOf方法的最快技术是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆