Implementing a "string pool" that is guaranteed not to move

Question

I need a "string pool" object into which I can repeatedly insert a "sequence of chars" (I use this phrase to mean "string" without confusing it with std::string or a C string), obtain a pointer to the sequence, and be guaranteed that the pointer will not become invalidated if/when the pool needs to grow. Using a simple std::string as the pool won't work, because of the possibility for the string to be reallocated when it outgrows its initial capacity, thus invalidating all previous pointers into it.

The pool will not grow without bound -- there are well-defined points at which I will call a clear() method on it -- but I don't want to reserve any maximum capacity on it, either. It should be able to grow, without moving.

One possibility I'm considering is inserting each new sequence of chars into a forward_list<string> and obtaining begin()->c_str(). Another is inserting into an unordered_set<string>, but I'm having a hard time finding out what happens when an unordered_set has to grow. The third possibility I'm considering (less enthusiastically) is rolling my own chain of 1K buffers into which I concatenate the sequence of chars. That has the advantage (I guess) of having the highest performance, which is a requirement for this project.
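For reference, the third option (a chain of fixed-size buffers) could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the question: the class name `ChunkedStringPool`, its member names, and the choice to append a terminating `'\0'` are all hypothetical, and it needs C++14 for `std::make_unique`. The idea is that a full chunk is never reallocated; a new chunk is started instead, so pointers handed out earlier stay valid until `clear()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical sketch: all names and the 1 KiB default are assumptions.
class ChunkedStringPool {
public:
    explicit ChunkedStringPool(std::size_t chunk_size = 1024)
        : chunk_size_(chunk_size) { assert(chunk_size > 0); }

    // Copies [data, data + len) into the pool, appends '\0', and returns a
    // pointer that stays valid until clear() or destruction.
    const char* insert(const char* data, std::size_t len) {
        const std::size_t need = len + 1;
        if (chunks_.empty() || need > capacity_ - used_) {
            // Start a new chunk; oversized sequences get one of their own.
            capacity_ = need > chunk_size_ ? need : chunk_size_;
            chunks_.push_back(std::make_unique<char[]>(capacity_));
            used_ = 0;
        }
        char* dest = chunks_.back().get() + used_;
        if (len != 0) std::memcpy(dest, data, len);
        dest[len] = '\0';
        used_ += need;
        return dest;
    }

    // Releases every chunk; all previously returned pointers become invalid.
    void clear() {
        chunks_.clear();
        used_ = 0;
        capacity_ = 0;
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t chunk_size_;
    std::size_t capacity_ = 0;  // size of the current (last) chunk
    std::size_t used_ = 0;      // bytes consumed in the current chunk
    std::vector<std::unique_ptr<char[]>> chunks_;
};
```

The vector of chunk pointers may itself reallocate as it grows, but that only moves the `unique_ptr` objects, not the character buffers they own. The trade-off in this sketch is that the unused tail of a chunk is wasted whenever the next sequence does not fit.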

I'd be interested in hearing how others would recommend approaching this.

UPDATE 1: edited to clarify my use of the phrase "sequence of chars" to be equivalent to the general notion of a "string" without implying either std::string or null-terminated char array.

Recommended answer

I've used this approach in the past:

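#include <set>
#include <string>
using namespace std;

using Atom = const char*;

// Interns value and returns a pointer to the stored copy. std::set is
// node-based, so inserting more strings never moves existing elements.
Atom make_atom(string const& value)
{
    static set<string> interned;
    return interned.insert(value).first->c_str();
}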

Obviously, if you want/need to clear the set, you'd make it available in some wider scope.
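One way to give the set a wider scope is to wrap it in a small class, so that the caller owns the pool and decides when to clear it. The following is a minimal sketch under that assumption; the class name `InternPool` and its member names are hypothetical and not part of the answer above.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Hypothetical wrapper: the names here are illustrative assumptions.
class InternPool {
public:
    using Atom = const char*;

    // Stores value (or finds an equal string already stored) and returns a
    // pointer to the pooled copy. The pointer stays valid until clear().
    Atom intern(std::string value) {
        return interned_.insert(std::move(value)).first->c_str();
    }

    // Drops every pooled string; all previously returned Atoms become invalid.
    void clear() { interned_.clear(); }

    std::size_t size() const { return interned_.size(); }

private:
    std::set<std::string> interned_;
};
```

Because equal strings share one set element, interning the same text twice yields the same pointer, so Atoms from one pool can be compared by address between two calls to `clear()`.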

For even more efficiency, move or emplace the strings into the set.

Update: I've added this approach for completeness. See it Live on Coliru.

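#include <set>
#include <string>
#include <type_traits>  // enable_if, is_constructible
#include <utility>      // forward
using namespace std;

using Atom = const char*;

// Participates in overload resolution only when a string can be
// constructed from Args...; the string is built in place inside the set.
template <typename... Args>
typename enable_if<
    is_constructible<string, Args...>::value, Atom
>::type emplace_atom(Args&&... args)
{
    static set<string> interned;
    return interned.emplace(forward<Args>(args)...).first->c_str();
}

#include <iostream>

int main() {
    cout << emplace_atom("Hello World\n");
    cout << emplace_atom(80, '=');  // string(80, '='): eighty '=' characters
}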
