`std::string` allocations are my current bottleneck - how can I optimize with a custom allocator?

Question

I'm writing a C++14 JSON library as an exercise and to use it in my personal projects.

By using callgrind I've discovered that the current bottleneck, during a stress test that continuously creates values from strings, is a std::string dynamic memory allocation. Precisely, the bottleneck is the call to malloc(...) made from std::string::reserve.

I've read that many existing JSON libraries such as rapidjson use custom allocators to avoid malloc(...) calls during string memory allocations.

I tried to analyze rapidjson's source code but the large amount of additional code and comments, plus the fact that I'm not really sure what I'm looking for, didn't help me much.

  • How do custom allocators help in this situation?
    • Is a memory buffer preallocated somewhere (where? statically?) and std::strings take available memory from it?
  • Are strings using custom allocators "compatible" with normal strings?
    • They have different types. Do they have to be "converted"? (And does that result in a performance hit?)

Code notes:

  • Str is an alias for std::string.

Answer

By default, std::string allocates memory as needed from the same heap as anything you allocate with malloc or new. To gain performance from a custom allocator, you need to manage your own "chunk" of memory in such a way that your allocator can hand out the amounts of memory your strings ask for faster than malloc does. Under the hood, your memory manager makes relatively few calls to malloc (or new, depending on your approach), requesting "large" amounts of memory at once, and then deals out sections of those blocks through the custom allocator. To actually achieve better performance than malloc, the memory manager usually has to be tuned to the known allocation patterns of your use cases.
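As a rough illustration of the "one big chunk, many small hand-outs" idea, here is a minimal bump-allocator sketch. The names Arena, ArenaAllocator and ArenaString are made up for this example and are not taken from rapidjson or any other library. It assumes all strings die before the arena does and that memory is only reclaimed in bulk.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical arena: one up-front allocation, then memory is handed out
// by advancing an offset. Nothing is freed individually.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(static_cast<char*>(::operator new(capacity))),
          capacity_(capacity) {}
    ~Arena() { ::operator delete(buffer_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t bytes, std::size_t align) {
        // Round the current offset up to the requested alignment.
        std::size_t start = (offset_ + align - 1) / align * align;
        if (start + bytes > capacity_) throw std::bad_alloc();
        offset_ = start + bytes;
        return buffer_ + start;
    }
    // Reclaims everything at once; only valid when no string still uses the arena.
    void reset() { offset_ = 0; }
    std::size_t used() const { return offset_; }

private:
    char* buffer_;
    std::size_t capacity_;
    std::size_t offset_ = 0;
};

// Minimal C++11-style allocator that forwards to an Arena.
template <class T>
struct ArenaAllocator {
    using value_type = T;

    explicit ArenaAllocator(Arena& a) noexcept : arena(&a) {}
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    // Intentionally a no-op: memory comes back only through Arena::reset().
    void deallocate(T*, std::size_t) noexcept {}

    Arena* arena;
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return a.arena == b.arena;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return !(a == b);
}

// A string type whose heap storage comes from an Arena.
using ArenaString =
    std::basic_string<char, std::char_traits<char>, ArenaAllocator<char>>;
```

To use it, create an Arena, build an ArenaAllocator<char> from it, and pass that allocator to the ArenaString constructor. Note that on common standard library implementations short strings live in the string object itself (small string optimization) and never reach the allocator, so any gain shows up only for longer strings. Whether this beats malloc for your workload is something to measure, not assume.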

This kind of thing often comes down to the age-old trade-off between memory use and execution speed. For example: if you have a known upper bound on your string sizes in practice, you can pull tricks with over-allocating to always accommodate the largest case. While this wastes memory, it can alleviate the performance overhead that more generalized allocation runs into with memory fragmentation. It also makes growing a string (the equivalent of a realloc call) essentially constant time for your purposes.
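One possible way to exploit a known upper bound is a pool of equally sized slots with a free list, sketched below. The names kSlotSize, SlotPool, global_pool, PoolAllocator and PoolString are hypothetical, and the value 64 is only a placeholder for whatever bound your data suggests. Requests larger than a slot fall back to operator new.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical upper bound (in bytes) on a string's storage.
constexpr std::size_t kSlotSize = 64;

// Pool of fixed-size slots. Every slot wastes (kSlotSize - actual size) bytes,
// but taking and returning a slot is a vector pop/push with no fragmentation.
class SlotPool {
public:
    ~SlotPool() {
        for (void* chunk : chunks_) ::operator delete(chunk);
    }
    void* take() {
        if (free_.empty()) refill();
        void* slot = free_.back();
        free_.pop_back();
        return slot;
    }
    void give(void* slot) { free_.push_back(slot); }  // capacity reserved in refill()
    std::size_t chunks() const { return chunks_.size(); }

private:
    void refill() {
        constexpr std::size_t kSlotsPerChunk = 256;
        char* chunk = static_cast<char*>(::operator new(kSlotSize * kSlotsPerChunk));
        chunks_.push_back(chunk);
        // Reserve room for every slot ever created so give() never reallocates.
        free_.reserve(chunks_.size() * kSlotsPerChunk);
        for (std::size_t i = 0; i < kSlotsPerChunk; ++i)
            free_.push_back(chunk + i * kSlotSize);
    }
    std::vector<void*> chunks_;
    std::vector<void*> free_;
};

// One shared pool; not thread-safe, and strings must not outlive it.
inline SlotPool& global_pool() {
    static SlotPool pool;
    return pool;
}

template <class T>
struct PoolAllocator {
    using value_type = T;

    PoolAllocator() noexcept = default;
    template <class U>
    PoolAllocator(const PoolAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n * sizeof(T) <= kSlotSize)
            return static_cast<T*>(global_pool().take());
        return static_cast<T*>(::operator new(n * sizeof(T)));  // oversized request
    }
    void deallocate(T* p, std::size_t n) noexcept {
        if (n * sizeof(T) <= kSlotSize) global_pool().give(p);
        else ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const PoolAllocator<T>&, const PoolAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const PoolAllocator<T>&, const PoolAllocator<U>&) { return false; }

using PoolString =
    std::basic_string<char, std::char_traits<char>, PoolAllocator<char>>;
```

With this sketch, creating and destroying many PoolString objects of up to kSlotSize bytes keeps reusing the same slots, so the underlying operator new is called once per 256 slots instead of once per string. The price is the memory wasted in each slot, which is the trade-off described above.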

@sehe is exactly right. There are many ways.

EDIT:

To finally address your second question, strings using different allocators can play nicely together, and usage should be transparent.

For example:

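#include <iostream>
#include <memory>
#include <string>

// Thin wrapper that adds nothing to std::allocator<char>.
class myalloc : public std::allocator<char> {};
myalloc customAllocator;

int main()
{
  // The constructor takes a const std::allocator<char>&, so the myalloc
  // object is converted to its base class here.
  std::string mystring(customAllocator);
  std::string regularString = "test string";
  mystring = regularString;  // both objects are plain std::string
  std::cout << mystring;

  return 0;
}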

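This is a fairly silly example and, of course, uses the same workhorse code under the hood. It shows assignment between a string constructed from a custom allocator object and a regular string. Implementing a useful allocator that supplies the full interface required by the STL, without just disguising the default std::allocator, is not as trivial. This seems to be a decent write-up covering the concepts involved.

The reason this particular example works is that both strings have the same type. The allocator type is the third template parameter of std::basic_string, and std::string fixes it to std::allocator<char>; the constructor argument only supplies an instance of that type, so the myalloc object is converted to its base class. A string declared with a genuinely different allocator type, such as std::basic_string<char, std::char_traits<char>, MyAlloc>, is a distinct type from std::string. The two cannot be assigned to each other directly, but you can copy the characters across, for example with assign(other.data(), other.size()), at the cost of that copy. The STL still does fun things with templates (such as rebind and allocator traits) to homogenize allocator interfaces.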
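For an allocator that is its own class template rather than a subclass handed to std::string's constructor, the picture looks roughly like the sketch below. MyAlloc, MyString, to_std and from_std are hypothetical names for this illustration, and MyAlloc simply forwards to operator new so that the example stays short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <type_traits>

// Placeholder allocator: a real one would draw from a pool or arena.
template <class T>
struct MyAlloc {
    using value_type = T;

    MyAlloc() noexcept = default;
    template <class U>
    MyAlloc(const MyAlloc<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const MyAlloc<T>&, const MyAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const MyAlloc<T>&, const MyAlloc<U>&) { return false; }

using MyString = std::basic_string<char, std::char_traits<char>, MyAlloc<char>>;

// The allocator is part of the type, so these are two different classes.
static_assert(!std::is_same<MyString, std::string>::value,
              "MyString and std::string are distinct types");

// Conversions copy the character data; there is no implicit conversion.
inline std::string to_std(const MyString& s) {
    return std::string(s.data(), s.size());
}
inline MyString from_std(const std::string& s) {
    return MyString(s.data(), s.size());
}
```

Each conversion costs one copy of the characters (plus an allocation for strings too long for the small string optimization), so it is usually cheapest to keep one string type inside the library and convert only at its boundary.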
