优化经常用字谜功能 [英] Optimizing very often used anagram function

查看:125
本文介绍了优化经常用字谜功能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我写一个函数,确定两个单词是否字谜。字 A是字B的字谜,如果你可以通过重新构建字B出A的 字母,例如:

I have written a function that determines whether two words are anagrams. Word A is an anagram of word B if you can build word B out of A just by rearranging the letters, e.g.:

lead is anagram of deal

这是我的功能:

bool is_anagram(std::string const & s1, std::string const & s2)
{
    auto check = [](std::string const & x)
    {
        std::map<char, unsigned> counter;
        for(auto const & c : x)
        {
            auto it = counter.find(c);
            if(it == counter.end())
                counter[c] = 1;
            else
                ++counter[c];
        }
        return counter;
    };

    return check(s1) == check(s2);
}

这工作得很好,但作为单词数增加(和使用此功能 在我的应用程序数百万次),它很快就成为了一个主要 瓶颈我的应用程序。

This works fine, but as the number of words increases (and this function is used several million times within my application), it very soon became a major bottleneck of my application.

有没有人对如何加快这个功能了一个想法?

Does anyone have an idea of how to speed this function up?

推荐答案

地图创建和您的来电的std ::地图::找到的迭代中, 是相当昂贵的。

The map creation and your call to std::map::find within the iteration, is quite expensive.

在这种情况下, 您可以使用一个事实,即的std ::在许多方面就像一个字符串行为 的std ::矢量&lt;焦炭&GT; ,这意味着您可以使用排序的std ::排序

In this case, you can use the fact that std::string behaves in many regards like a std::vector<char>, meaning that you can sort it using std::sort:

bool is_anagram(std::string s1, std::string s2)
{
    std::sort(s1.begin(), s1.end());
    std::sort(s2.begin(), s2.end());
    return s1 == s2;
}

相反,正在创建的,我走的是字符串的副本两个地图 (按价值计算,而不是常量引用传递它们),并整理他们,所以

Instead of the two maps that you are creating, I am taking a copy of the strings (passing them by value instead of const reference) and sorting them, so

sort("lead") => "adel"
sort("deal") => "adel"

这个变化应该已经被相当多的加快你的算法了。多一个 如果你喜欢比较随意,可能会极大地影响性能的事 也就是说:

This change should already speed your algorithm up by quite a bit. One more thing that may greatly affect the performance if you tend to compare arbitrary words:

bool is_anagram(std::string s1, std::string s2)
{
    if(s1.length() != s2.length())
        return false;
    /* as above */
}

如果两个字符串的长度不同,它显然不能是一个字谜。 的std ::字符串::长度()是一个非常快的操作(它需要储存 字符串大小反正),所以我们为我们节省了 O(N *日志(N))从喧嚣 排序的两个字符串。

If the length of the two strings differs, it obviously cannot be an anagram. std::string::length() is a very fast operation (it needs to store the string size anyway), so we save us the hustle of O(N*log(N)) from sorting the two strings.

这篇关于优化经常用字谜功能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆