C++ unordered_map<string, ...> lookup without constructing string

Question

I have C++ code that investigates a BIG string and matches lots of substrings. As much as possible, I avoid constructing std::strings, by encoding substrings like this:

char* buffer, size_t bufferSize

At some point, however, I'd like to look up a substring in one of these:

std::unordered_map<std::string, Info> stringToInfo = {...

So I do this:

stringToInfo.find(std::string(buffer, bufferSize))

That constructs a std::string for the sole purpose of the lookup.

I feel like there's an optimization I could do here, by... changing the key-type of the unordered_map to some kind of temporary string imposter, a class like this...

class SubString
{
    char* buffer;
    size_t bufferSize;

    // ...
};

... that does the same logic as std::string to hash and compare, but then doesn't deallocate its buffer when it's destroyed.

So, my question is: is there a way to get the standard classes to do this, or do I write this class myself?

Recommended answer

What you want to do is called heterogeneous lookup. Since C++14 it has been supported for std::map::find and std::set::find (note versions (3) and (4) of the functions, which are templated on the lookup value type), provided the container uses a transparent comparator such as std::less<>. It is more complicated for unordered containers, because they need to be told of, or find, hash functions for all the lookup key types that produce the same hash value for the same text. When this answer was written, a proposal was under consideration for a future Standard: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0919r0.html (it has since been adopted in C++20).
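For the ordered-container case, a minimal sketch might look like the following. It assumes C++17 (for std::string_view); the Info payload, the InfoMap alias, and the lookup and demo helpers are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical payload type, standing in for the question's Info.
struct Info { int id; };

// std::less<> is a transparent comparator; it enables the templated
// find overloads that accept types other than std::string.
using InfoMap = std::map<std::string, Info, std::less<>>;

// Looks up a (buffer, size) substring without building a std::string.
// Returns nullptr when the key is absent.
inline const Info* lookup(const InfoMap& m, const char* buffer, std::size_t bufferSize)
{
    auto it = m.find(std::string_view(buffer, bufferSize));
    return it == m.end() ? nullptr : &it->second;
}

// Small usage example: find "beta" inside a larger buffer.
inline void demo()
{
    InfoMap m;
    m.emplace("beta", Info{2});
    const char* big = "xxbetayy";
    const Info* hit = lookup(m, big + 2, 4);
    assert(hit != nullptr && hit->id == 2);
}
```

The trade-off is that std::map lookups are O(log n) comparisons rather than a single hash, so whether this beats constructing a temporary std::string depends on the data.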

Meanwhile, you could use another library that already supports heterogeneous lookup, e.g. boost::unordered_map::find.

If you want to stick to std::unordered_map, you could avoid creating so many string temporaries by storing a std::string member alongside your unordered_map that you can reassign values to, then pass that string to find. You could encapsulate this in a custom container class.
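A rough sketch of that idea, assuming a hypothetical InfoTable wrapper and Info payload (neither name comes from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical payload type, standing in for the question's Info.
struct Info { int id; };

// Hypothetical wrapper: owns the map plus one scratch std::string
// that is reassigned for every lookup.
class InfoTable
{
public:
    void insert(std::string key, Info info)
    {
        map_.emplace(std::move(key), info);
    }

    // assign() can reuse scratch_'s existing capacity, so once it has grown
    // to fit the longest key looked up, further lookups normally do not allocate.
    const Info* find(const char* buffer, std::size_t bufferSize)
    {
        scratch_.assign(buffer, bufferSize);
        auto it = map_.find(scratch_);
        return it == map_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, Info> map_;
    std::string scratch_;  // shared by every find() call, so not thread-safe
};

// Small usage example: find "beta" inside a larger buffer.
inline void demo()
{
    InfoTable table;
    table.insert("beta", Info{2});
    const char* big = "xxbetayy";
    const Info* hit = table.find(big + 2, 4);
    assert(hit != nullptr && hit->id == 2);
}
```

This still copies the characters on each lookup; what it avoids is the repeated allocation and deallocation of a fresh temporary.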

Another route is to write a custom class to use as your unordered container key:

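#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <string_view>
#include <utility>

struct CharPtrOrString
{
    const char* p_;   // non-null: borrowed NUL-terminated text (lookups)
    std::string s_;   // used when p_ is nullptr (keys stored in the container)

    explicit CharPtrOrString(const char* p) : p_{p} { }
    CharPtrOrString(std::string s) : p_{nullptr}, s_{std::move(s)} { }

    // Compare whichever member is in use on each side.
    bool operator==(const CharPtrOrString& x) const
    {
        return p_ ? x.p_ ? std::strcmp(p_, x.p_) == 0
                         : p_ == x.s_
                  : x.p_ ? s_ == x.p_
                         : s_ == x.s_;
    }

    struct Hash
    {
        // Hash through std::string_view either way, so the same text gives
        // the same hash regardless of which member holds it.
        std::size_t operator()(const CharPtrOrString& x) const
        {
            std::string_view sv{x.p_ ? x.p_ : x.s_.c_str()};
            return std::hash<std::string_view>()(sv);
        }
    };
};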

You can then construct a CharPtrOrString from each std::string for use as an unordered container key, but construct one cheaply from your const char* each time you call find. Note that operator== above has to work out which one you did (the convention used is that if the pointer is nullptr then the std::string member is in use), so that it compares the in-use members. The hash function has to make sure a std::string with a particular textual value produces the same hash as a const char* with that text (which it doesn't by default with GCC 7.3 and/or Clang 6; I work with both and remember that one had an issue, but not which).
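CharPtrOrString as written expects NUL-terminated text, whereas the question holds a pointer plus a length. One possible adaptation is sketched below, assuming C++17. ViewOrString, Info, InfoMap, lookup, and demo are all hypothetical names, and this is a variant of the answer's class rather than the answer's own code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical payload type, standing in for the question's Info.
struct Info { int id; };

// Hypothetical variant of CharPtrOrString that carries an explicit length,
// so the borrowed text does not need to be NUL-terminated.
struct ViewOrString
{
    const char* p_ = nullptr;  // non-null: borrowing an external buffer
    std::size_t n_ = 0;        // length of the borrowed buffer
    std::string s_;            // used when p_ is nullptr

    ViewOrString(const char* p, std::size_t n) : p_{p}, n_{n} { }
    ViewOrString(std::string s) : s_{std::move(s)} { }

    // Computed on demand rather than stored, so copies and moves stay valid.
    std::string_view view() const
    {
        return p_ ? std::string_view(p_, n_) : std::string_view(s_);
    }

    bool operator==(const ViewOrString& x) const { return view() == x.view(); }

    struct Hash
    {
        // Both representations hash through std::string_view, so equal text
        // always yields an equal hash.
        std::size_t operator()(const ViewOrString& x) const
        {
            return std::hash<std::string_view>()(x.view());
        }
    };
};

using InfoMap = std::unordered_map<ViewOrString, Info, ViewOrString::Hash>;

// Builds a borrowing key (no allocation) just for the duration of find.
inline const Info* lookup(const InfoMap& m, const char* buffer, std::size_t bufferSize)
{
    auto it = m.find(ViewOrString(buffer, bufferSize));
    return it == m.end() ? nullptr : &it->second;
}

// Small usage example: stored keys own their text, lookups borrow it.
inline void demo()
{
    InfoMap m;
    m.emplace(ViewOrString(std::string("beta")), Info{2});
    const char* big = "xxbetayy";
    const Info* hit = lookup(m, big + 2, 4);
    assert(hit != nullptr && hit->id == 2);
}
```

Keys stored in the container should be built from a std::string so that they own their text; a borrowing key left in the map would dangle once the original buffer goes away.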
