uninitialized_copy memcpy/memmove optimization

Question

I've recently started to examine the STL in MSVC's implementation. There are some nice tricks there, but I don't know why the following criteria are used.

std::uninitialized_copy is optimized to a simple memcpy/memmove if some conditions are met. As I understand it, the input range can be memcpy'd to the uninitialized area if the target type U is trivially copy-constructible from the source type T.

However, the MSVC implementation checks a whole lot of things before choosing memcpy over copy-constructing the elements one by one. I did not want to paste the related code here; instead I'm sharing it through pastebin if anyone is interested: https://pastebin.com/Sa4Q7Qj0

The base algorithm for uninitialized_copy is something like this (exception handling is omitted for readability):

template <typename T, typename... Args>
inline void construct_in_place(T& obj, Args&&... args)
{
    // Placement-new a T into the storage that obj refers to.
    ::new (static_cast<void*>(addressof(obj))) T(forward<Args>(args)...);
}

template <typename In, typename Out>
inline Out uninitialized_copy(In first, In last, Out dest)
{
    for (; first != last; ++first, ++dest)
        construct_in_place(*dest, *first);
    return dest; // one past the last constructed element
}
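For reference, the exception handling that the snippet above leaves out could look roughly like the following. This is only a sketch under my own assumptions, not MSVC's code: the names uninitialized_copy_sketch, Tracked and live_after_failed_copy are hypothetical and exist purely for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <memory>
#include <new>
#include <stdexcept>

// Hypothetical name, chosen so it does not clash with std::uninitialized_copy.
template <typename In, typename Out>
Out uninitialized_copy_sketch(In first, In last, Out dest)
{
    using T = typename std::iterator_traits<Out>::value_type;
    Out cur = dest;
    try {
        for (; first != last; ++first, ++cur)
            ::new (static_cast<void*>(std::addressof(*cur))) T(*first);
        return cur;
    } catch (...) {
        // Roll back: destroy what was already constructed, then rethrow.
        for (; dest != cur; ++dest)
            std::destroy_at(std::addressof(*dest));
        throw;
    }
}

// Test type (an assumption for this sketch): counts live objects and
// throws from its copy constructor once the countdown reaches zero.
struct Tracked {
    static inline int live = 0;
    static inline int copies_until_throw = -1;  // -1 means "never throw"
    Tracked() { ++live; }
    Tracked(const Tracked&) {
        if (copies_until_throw == 0) throw std::runtime_error("copy failed");
        if (copies_until_throw > 0) --copies_until_throw;
        ++live;
    }
    ~Tracked() { --live; }
};

// Returns the number of live objects after a copy whose third element throws.
int live_after_failed_copy()
{
    Tracked src[3];
    alignas(Tracked) unsigned char raw[3 * sizeof(Tracked)];
    Tracked::copies_until_throw = 2;  // two copies succeed, the third throws
    try {
        uninitialized_copy_sketch(src, src + 3, reinterpret_cast<Tracked*>(raw));
    } catch (const std::runtime_error&) {
        // The two copies made before the failure have already been destroyed.
    }
    Tracked::copies_until_throw = -1;
    return Tracked::live;  // counts only the elements of src
}
```

The rollback loop is what makes a per-element implementation more than a plain loop, and it is also dead weight for types whose copy cannot throw, which is part of why a memcpy/memmove fast path is attractive.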

This can be optimized to a memcpy/memmove if the copy-construction doesn't do anything 'special' (that is, if the type is trivially copy-constructible).
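To make the idea concrete, here is a minimal sketch of such a dispatch for plain pointers where source and destination have the same element type. It is not taken from any standard library: the name uninitialized_copy_ptr_sketch is hypothetical, and the choice of std::is_trivially_copyable_v as the predicate is my assumption (which predicate is the right one is exactly what this question is about).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical helper: copies [first, last) into raw storage starting at dest.
template <typename T>
T* uninitialized_copy_ptr_sketch(const T* first, const T* last, T* dest)
{
    if constexpr (std::is_trivially_copyable_v<T>) {
        // Fast path: one bulk byte copy instead of a constructor call per element.
        const std::size_t n = static_cast<std::size_t>(last - first);
        if (n != 0)
            std::memmove(dest, first, n * sizeof(T));
        return dest + n;
    } else {
        // Slow path: copy-construct each element in place.
        for (; first != last; ++first, ++dest)
            ::new (static_cast<void*>(dest)) T(*first);
        return dest;
    }
}

// Small demo: copies four ints into raw storage and compares the bytes.
bool copy_ints_demo()
{
    const int src[4] = {1, 2, 3, 4};
    alignas(int) unsigned char raw[sizeof src];
    int* dest = reinterpret_cast<int*>(raw);
    int* end = uninitialized_copy_ptr_sketch(src, src + 4, dest);
    return end == dest + 4 && std::memcmp(dest, src, sizeof src) == 0;
}
```

The branch is chosen at compile time with if constexpr, so a type that is not trivially copyable never instantiates the memmove call.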

MS's implementation requires the following:

  • T is trivially assignable to U
  • T is trivially copyable to U
  • T is trivial
  • extra checks (like sizeof(T) == sizeof(U)) if T != U

So, for example, the following struct cannot be memcpy'd:

struct Foo
{
    int i;
    Foo() : i(10) { }
};

But the following is OK:

struct Foo
{
    int i;
    Foo() = default; // or simply omit
};
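The listed conditions can be approximated with standard type traits, which makes it easy to see which of the two structs passes. The trait below is a hypothetical sketch of my own (memcpy_ok_sketch is not an MSVC name, and the real implementation contains more checks than this); the structs are renamed FooInit and FooDefault only so that both can live in one translation unit.

```cpp
#include <type_traits>

struct FooInit    { int i; FooInit() : i(10) {} };    // user-provided default constructor
struct FooDefault { int i; FooDefault() = default; }; // defaulted default constructor

// Hypothetical approximation of the four listed conditions.
template <typename T, typename U>
inline constexpr bool memcpy_ok_sketch =
    std::is_trivially_assignable_v<U&, const T&> &&      // T trivially assignable to U
    std::is_trivially_constructible_v<U, const T&> &&    // U trivially constructible from T
    std::is_trivial_v<T> &&                              // T is trivial
    (std::is_same_v<T, U> || sizeof(T) == sizeof(U));    // extra check when T != U

static_assert(!memcpy_ok_sketch<FooInit, FooInit>);      // rejected: not trivial
static_assert(memcpy_ok_sketch<FooDefault, FooDefault>); // accepted

// The rejected struct is nevertheless trivially copyable and
// trivially copy-constructible, which is what prompts the question.
static_assert(std::is_trivially_copyable_v<FooInit>);
static_assert(std::is_trivially_copy_constructible_v<FooInit>);
```

Only the std::is_trivial_v condition separates the two structs; every other condition holds for both.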

Shouldn't it be enough to check if type U can be trivially copy-constructed from type T? Because that's all uninitialized_copy does.

For example, I can't see why the following is not memcpy'd by MS's STL implementation (NOTE: I know the reason, it is the user-defined constructor, but I don't understand the logic behind it):

struct Foo
{
    int i;

    Foo() noexcept
        : i(10)
    {
    }

    Foo(const Foo&) = default;
};

void test()
{
    // please forgive me...
    alignas(Foo) unsigned char raw[256];  // raw storage, suitably aligned for Foo
    Foo* dest = (Foo*)raw;
    Foo src[] = { Foo(), Foo() };

    bool b = std::is_trivially_copy_constructible<Foo>::value;  // true
    bool b2 = std::is_trivially_copyable<Foo>::value;           // true

    memcpy(dest, src, sizeof(src)); // seems ok

    // uninitialized_copy does not use memcpy/memmove, it calls the copy-ctor one-by-one
    std::uninitialized_copy(src, src + sizeof(src) / sizeof(src[0]), dest);
}

Related SO post: Why doesn't gcc use memmove in std::uninitialized_copy?

As @Igor Tandetnik pointed out in the comments, it is not safe to assume that a type T is trivially copy-constructible just because it has no user-defined copy constructor. He provided the following example:

struct Foo
{
    std::string data;
};

In this example there is no user-defined copy constructor, and yet the type is still not trivially copy-constructible. Thank you for the correction; I have modified the original post based on the feedback.
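The point of that example can be checked at compile time. A small sketch (the struct names WithString and WithInt are mine, used only to contrast the two cases):

```cpp
#include <string>
#include <type_traits>

struct WithString { std::string data; };  // no user-declared copy constructor
struct WithInt    { int data; };          // likewise, but with a trivial member

// The implicit copy constructor exists, but it has to call std::string's
// copy constructor, so it is not trivial.
static_assert(std::is_copy_constructible_v<WithString>);
static_assert(!std::is_trivially_copy_constructible_v<WithString>);
static_assert(!std::is_trivially_copyable_v<WithString>);

// With only trivially copyable members the implicit copy constructor is trivial.
static_assert(std::is_trivially_copy_constructible_v<WithInt>);
```

In other words, triviality depends on what the implicitly generated constructor has to do for the members and bases, not on whether the user wrote one.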

Recommended answer

uninitialized_copy has two responsibilities: First, it has to make sure that the right bit-pattern gets into the destination buffer. Second, it has to start the lifetime of the C++ objects in that buffer. That is, it must call a constructor of some kind, unless the C++ Standard specifically grants it permission to skip that constructor call.

According to my very incomplete research, it appears that right now only trivially copyable types are guaranteed to have their bit patterns preserved by memcpy/memmove; memcpying any other kind of type (even if it happens to be trivially copy-constructible and/or trivially copy-assignable!) formally produces undefined behavior.

And furthermore, it appears that right now only trivial types can "pop into existence" without a constructor call. (P0593 "Implicit creation of objects..." proposes a lot of changes in this area, maybe in C++2b.)

Jonathan Wakely's comment on libstdc++ bug 68350 seems to indicate that GNU libstdc++ is trying to remain within the letter of the law by never "popping into existence" any objects of non-trivial type, even though, as a C++ implementation, they do have latitude to exploit platform-specific behavior in the name of performance. I would guess that MSVC is following similar logic, for similar reasons (whatever those reasons are).

You can see the vendors' unwillingness to "pop objects into existence" by comparing their willingness to optimize std::copy versus std::uninitialized_copy on class types which are "trivially copyable but not trivial." Being trivially copyable means std::copy can use memcpy to assign over existing objects; but std::uninitialized_copy, to make those objects pop into existence in the first place, still feels the need to call some constructor in a loop, even if it's the trivial copy constructor!

class C { int i; public: C() = default; };
class D { int i; public: D() {} };
static_assert(std::is_trivially_copyable_v<C> && !std::is_aggregate_v<C>);
static_assert(std::is_trivially_copyable_v<D> && !std::is_aggregate_v<D>);

void copyCs(C *p, C *q, int n) {
    std::copy(p, p+n, q);  // GNU and MSVC both optimize
    std::uninitialized_copy(p, p+n, q);  // GNU and MSVC both optimize
}
void copyDs(D *p, D *q, int n) {
    std::copy(p, p+n, q);  // GNU and MSVC both optimize
    std::uninitialized_copy(p, p+n, q);  // neither GNU nor MSVC optimizes :(
}
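The difference between C and D that drives this behavior can be stated with the standard traits. The following is a small self-contained check, repeating the two class definitions from above:

```cpp
#include <type_traits>

class C { int i; public: C() = default; };
class D { int i; public: D() {} };

// Both are trivially copyable, so overwriting existing objects with memcpy is fine.
static_assert(std::is_trivially_copyable_v<C>);
static_assert(std::is_trivially_copyable_v<D>);

// Only C is trivial: D's default constructor is user-provided.
static_assert(std::is_trivial_v<C>);
static_assert(!std::is_trivial_v<D>);
static_assert(!std::is_trivially_default_constructible_v<D>);

// D's copy constructor is still trivial, which is why the per-element
// loop in uninitialized_copy looks so unnecessary for it.
static_assert(std::is_trivially_copy_constructible_v<D>);
```

So D is the "trivially copyable but not trivial" case: std::copy may treat it as raw bytes, while uninitialized_copy falls back to the constructor loop.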

You wrote:

Shouldn't it be enough to check if type U can be trivially copy-constructed from type T? Because that's all uninitialized_copy does.

Yes, but when T and U are different, you're not doing "trivial copy-construction"; you're doing a "trivial construction" that is not copy-construction. And unfortunately the C++ Standard defines is_trivially_constructible<T,U> to mean something different from what humans mean by "trivial"! My blog post "Trivially-constructible-from" (July 2018) gives this example:

assert(is_trivially_constructible_v<u64, u64b>);
// Yay!

using u16 = short;
assert(is_trivially_constructible_v<u64, u16>);
// What the...

assert(is_trivially_constructible_v<u64, double>);
// ...oh geez.
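The snippet above does not show how u64 is declared, so here is a self-contained variant under my own assumption that u64 is std::uint64_t. It also shows why "trivially constructible" in the Standard's sense does not imply "memcpy-able": the function conversion_differs_from_memcpy (a hypothetical name) compares a value conversion with a raw byte copy.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

using u16 = short;
using u64 = std::uint64_t;  // assumption: the original alias is not shown

// Trivial in the Standard's sense, yet neither is a plain byte copy.
static_assert(std::is_trivially_constructible_v<u64, u16>);
static_assert(std::is_trivially_constructible_v<u64, double>);

// Converting 1.0 to an integer and copying the bytes of 1.0 give different
// results wherever double does not store 1.0 as the integer bit pattern 1
// (true for IEEE 754 doubles).
bool conversion_differs_from_memcpy()
{
    static_assert(sizeof(u64) == sizeof(double), "sketch assumes a 64-bit double");
    const double d = 1.0;
    const u64 converted = static_cast<u64>(d);  // value conversion
    u64 bits = 0;
    std::memcpy(&bits, &d, sizeof bits);        // raw byte copy
    return converted != bits;
}
```

For u16 the problem is even more basic: on mainstream platforms the two types differ in size, so a bulk byte copy would not even line the elements up.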

This explains some of MSVC's

extra checks (like sizeof(T) == sizeof(U)) if T != U

Specifically, MSVC's _Ptr_cat_helper<T*,U*>::_Really_trivial trait relies on those extra checks to detect some (but not all) common situations where the conversion from T to U is "really" trivial in the human/bitwise sense, and not just trivial in the C++-Standard sense. This allows MSVC to optimize copying an array of int* into an array of const int*, which is something libstdc++ can't do:

using A = int*;
using B = const int*;

void copyAs(A *p, B *q, int n) {
    std::uninitialized_copy(p, p+n, q);  // only MSVC optimizes
}
void copyBs(B *p, B *q, int n) {
    std::uninitialized_copy(p, p+n, q);  // GNU and MSVC both optimize
}
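To illustrate what a "really trivial" check for the pointer case might look like, here is a rough sketch. It is emphatically not MSVC's _Ptr_cat_helper: the trait bitwise_convertible_sketch and the function pointer_copy_demo are hypothetical names, and the rule encoded here (same pointee type up to cv-qualification, and implicitly convertible) is my simplification.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Primary template: only identical types count as bitwise-convertible.
template <typename From, typename To>
inline constexpr bool bitwise_convertible_sketch = std::is_same_v<From, To>;

// Pointers: allow T* -> U* when the pointees differ only in cv-qualification
// and the conversion is implicit (so const int* -> int* is rejected).
template <typename T, typename U>
inline constexpr bool bitwise_convertible_sketch<T*, U*> =
    std::is_same_v<std::remove_cv_t<T>, std::remove_cv_t<U>> &&
    std::is_convertible_v<T*, U*>;

static_assert(bitwise_convertible_sketch<int*, const int*>);
static_assert(!bitwise_convertible_sketch<const int*, int*>);
static_assert(!bitwise_convertible_sketch<int*, long*>);
static_assert(!bitwise_convertible_sketch<short, long long>);

// Copies an array of int* into an array of const int* as raw bytes and
// checks that the resulting pointers compare equal to the originals.
bool pointer_copy_demo()
{
    int a = 1, b = 2;
    int* src[2] = {&a, &b};
    const int* dst[2] = {nullptr, nullptr};
    static_assert(sizeof src == sizeof dst);
    std::memcpy(dst, src, sizeof src);
    return dst[0] == &a && dst[1] == &b;
}
```

A check of this shape accepts the int* to const int* case from copyAs while still rejecting conversions such as short to long long, which are trivially constructible in the Standard's sense but change the representation.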
