In C++, why overload a function on a `const char array` and a private struct wrapping a `const char*`?

Question

I recently ran into a fascinating class in the EnTT library. This class is used to calculate hashes for strings, like so:

std::uint32_t hashVal = hashed_string::to_value("ABC");

hashed_string hs{"ABC"};
std::uint32_t hashVal2 = hs.value();

While looking at the implementation of this class, I noticed that none of the constructors or hashed_string::to_value member functions take a const char* directly. Instead, they take a simple struct called const_wrapper. Below is a simplified view of the class's implementation to illustrate this:

/*
   A hashed string is a compile-time tool that allows users to use
   human-readable identifiers in the codebase while using their numeric
   counterparts at runtime
*/
class hashed_string
{
private:

    struct const_wrapper
    {
        // non-explicit constructor on purpose
        constexpr const_wrapper(const char *curr) noexcept: str{curr} {}
        const char *str;
    };

    inline static constexpr std::uint32_t calculateHash(const char* curr) noexcept
    {
        // ...
    }

public:

    /*
       Returns directly the numeric representation of a string.
       Forcing template resolution avoids implicit conversions. A
       human-readable identifier can be anything but a plain, old bunch of
       characters.
       Example of use:
       const auto value = hashed_string::to_value("my.png");
    */
    template<std::size_t N>
    inline static constexpr std::uint32_t to_value(const char (&str)[N]) noexcept
    {
        return calculateHash(str);
    }

    /*
       Returns directly the numeric representation of a string.
       wrapper parameter helps achieving the purpose by relying on overloading.
    */
    inline static std::uint32_t to_value(const_wrapper wrapper) noexcept
    {
        return calculateHash(wrapper.str);
    }

    /*
       Constructs a hashed string from an array of const chars.
       Forcing template resolution avoids implicit conversions. A
       human-readable identifier can be anything but a plain, old bunch of
       characters.
       Example of use:
       hashed_string hs{"my.png"};
    */
    template<std::size_t N>
    constexpr hashed_string(const char (&curr)[N]) noexcept
        : str{curr}, hash{calculateHash(curr)}
    {}

    /*
       Explicit constructor on purpose to avoid constructing a hashed
       string directly from a `const char *`.
       wrapper parameter helps achieving the purpose by relying on overloading.
    */
    explicit constexpr hashed_string(const_wrapper wrapper) noexcept
        : str{wrapper.str}, hash{calculateHash(wrapper.str)}
    {}

    //...

private:
    const char *str;
    std::uint32_t hash;
};

Unfortunately I fail to see the purpose of the const_wrapper struct. Does it have something to do with the comment at the top, which states "A hashed string is a compile-time tool..."?

I am also unsure what the comments that appear above the template functions mean, which state "Forcing template resolution avoids implicit conversions." Is anyone able to explain this?

Finally, it is interesting to note how this class is used by another class that maintains an std::unordered_map of the following type: std::unordered_map<hashed_string, Resource>

This other class offers a member function to add resources to the map using strings as keys. A simplified view of its implementation looks like this:

bool addResource(hashed_string id, Resource res)
{
    // ...
    resourceMap[id] = res;
    // ...
}

My question here is: what is the advantage of using hashed_strings as the keys to our map instead of std::strings? Is it more efficient to work with numeric types like hashed_strings?

Thank you for any information. Studying this class has helped me learn so much.

Recommended answer

The author is trying to help you avoid accidental performance problems that happen when you repeatedly hash strings. Since hashing strings is expensive, you probably want to do it once and cache the result somewhere. If hashed_string had an implicit constructor from const char*, you could hash the same string repeatedly without knowing or intending to do so.

So the library provides implicit construction for string literals, whose hash can be computed at compile time via constexpr, but only explicit construction for const char* in general, since those hashes can't generally be computed at compile time and you want to avoid computing them repeatedly or accidentally.
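The compile-time versus run-time distinction can be sketched as follows. This is only an illustration: the names hash_chars, hash_literal and hash_runtime are hypothetical, and 32-bit FNV-1a is assumed as the hash function because the question elides the body of calculateHash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constexpr hash (32-bit FNV-1a, assumed for illustration).
// Written recursively so it also satisfies C++11 constexpr rules.
constexpr std::uint32_t hash_chars(const char *s, std::uint32_t h = 2166136261u) noexcept
{
    return *s == '\0'
        ? h
        : hash_chars(s + 1,
                     (h ^ static_cast<std::uint32_t>(static_cast<unsigned char>(*s))) * 16777619u);
}

// Array overload: accepts string literals, so the call can be a constant expression.
template<std::size_t N>
constexpr std::uint32_t hash_literal(const char (&str)[N]) noexcept
{
    return hash_chars(str);
}

// Checked by the compiler: the hash of the literal is a compile-time constant.
static_assert(hash_literal("a") == 0xE40C292Cu, "literal hashed at compile time");

// A pointer whose value is only known at run time is hashed when the call executes.
inline std::uint32_t hash_runtime(const char *s) noexcept
{
    return hash_chars(s);
}
```

Both paths produce the same value; the difference is only when the work happens, which is the cost the explicit constructor is meant to make visible.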

Consider:

void consume( hashed_string );

int main()
{
    const char* const s = "abc";
    const auto hs1 = hashed_string{"my.png"}; // Ok - explicit, compile-time hashing
    const auto hs2 = hashed_string{s};        // Ok - explicit, runtime hashing

    consume( hs1 ); // Ok - cached value - no hashing required
    consume( hs2 ); // Ok - cached value - no hashing required

    consume( "my.png" ); // Ok - implicit, compile-time hashing
    consume( s );        // Error! Implicit, runtime hashing disallowed!
                         // Potential hidden inefficiency, so library disallows it.
}

If I remove the last line, you can see how the compiler applies the implicit conversions for you at C++ Insights:

consume(hashed_string(hs1));
consume(hashed_string(hs2));
consume(hashed_string("my.png"));

But it refuses to do so for the line consume(s), because the only constructor that can accept a const char* is the explicit one.
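The overload mechanism behind this can be reproduced in isolation. The sketch below uses a hypothetical stand-in class, tagged_string, that mirrors the two constructors from the question but only records which one ran; the static_asserts state what the compiler will and will not convert implicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for hashed_string; it only records which constructor ran.
class tagged_string
{
    struct const_wrapper
    {
        // Non-explicit on purpose: a const char* converts to const_wrapper implicitly.
        constexpr const_wrapper(const char *curr) noexcept : str{curr} {}
        const char *str;
    };

public:
    // Implicit constructor: an array (such as a string literal) binds directly,
    // so this overload wins without any conversion.
    template<std::size_t N>
    constexpr tagged_string(const char (&)[N]) noexcept : from_array{true} {}

    // Explicit constructor: the only route for a const char*, and the caller
    // has to spell the construction out.
    explicit constexpr tagged_string(const_wrapper) noexcept : from_array{false} {}

    bool from_array;
};

inline bool consume(tagged_string ts) { return ts.from_array; }

// Arrays convert implicitly; pointers can only be used for explicit construction.
static_assert(std::is_convertible<const char (&)[4], tagged_string>::value,
              "array converts implicitly");
static_assert(!std::is_convertible<const char *, tagged_string>::value,
              "pointer does not convert implicitly");
static_assert(std::is_constructible<tagged_string, const char *>::value,
              "pointer works when construction is explicit");
```

Because template argument deduction never applies the array-to-pointer conversion to match `const char (&)[N]`, a pointer cannot select the template overload, which is one reading of the comment "Forcing template resolution avoids implicit conversions."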

Note, however, this attempt at protecting the user isn't foolproof. If you declare your string as an array rather than as a pointer, you can accidentally re-hash:

const char s[100] = "abc";
consume( s );  // Compiles BUT it's doing implicit, runtime hashing. Doh.

// Decay 's' back to a pointer, and the library's guardrails return
const auto consume_decayed = []( const char* str ) { consume( str ); };
consume_decayed( s ); // Error! Implicit, runtime hashing disallowed!

This case is less common, and such arrays typically decay into pointers as they are passed to other functions, which would then behave as above. The library could conceivably enforce compile-time hashing for string literals with if constexpr and the like, and forbid implicit construction from non-literal arrays like s above. (There's your pull request to give back to the library!) [See comments.]

To answer your final question: the reason for doing this is faster performance with hash-based containers like std::unordered_map. Computing the hash once and caching it inside the hashed_string minimizes the number of hashes you have to compute. A key lookup in the map then just has to compare the pre-computed hash values of the keys and the lookup string.
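A minimal sketch of that idea, under stated assumptions: cached_key, cached_key_hash, cached_key_equal and resource_map are hypothetical names, 32-bit FNV-1a is assumed as the hash, and int stands in for Resource. How the real library plugs hashed_string into std::unordered_map is not shown in the question, so this is not a description of its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Hypothetical key type: hashes the characters once, at construction, and caches the result.
struct cached_key
{
    explicit cached_key(const char *s) noexcept : str{s}, hash{fnv1a(s)} {}

    // 32-bit FNV-1a, assumed here only for illustration.
    static std::uint32_t fnv1a(const char *s) noexcept
    {
        std::uint32_t h = 2166136261u;
        for (; *s != '\0'; ++s) {
            h = (h ^ static_cast<unsigned char>(*s)) * 16777619u;
        }
        return h;
    }

    const char *str;
    std::uint32_t hash;
};

// Hash functor: returns the cached value instead of walking the characters again.
struct cached_key_hash
{
    std::size_t operator()(const cached_key &k) const noexcept { return k.hash; }
};

// Equality: compares the cached hashes first, then the characters, so that two
// different strings with colliding hashes are still treated as different keys.
// (The character comparison is an assumption of this sketch.)
struct cached_key_equal
{
    bool operator()(const cached_key &a, const cached_key &b) const noexcept
    {
        return a.hash == b.hash && std::strcmp(a.str, b.str) == 0;
    }
};

// int stands in for the Resource type from the question.
using resource_map = std::unordered_map<cached_key, int, cached_key_hash, cached_key_equal>;
```

With a std::string key, every insertion and lookup hashes the whole string again; here the map only reads an integer that was computed when the key was built.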
