Can I use memcpy to write to multiple adjacent Standard Layout sub-objects?


Question


    Disclaimer: This is an attempt to drill down on a larger problem, so please don't get hung up on whether the example makes any sense in practice.

    And, yes, if you want to copy objects, please use / provide the copy-constructor. (But note how even the example does not copy a whole object; it tries to blit some memory over a few adjacent(Q.2) integers.)


    Given a C++ Standard Layout struct, can I use memcpy to write to multiple (adjacent) sub-objects at once?

    Complete example: ( https://ideone.com/1lP2Gd https://ideone.com/YXspBk)

    #include <vector>
    #include <iostream>
    #include <assert.h>
    #include <inttypes.h>
    #include <stddef.h>
    #include <string.h>  // declares memcpy; <memory.h> is a non-standard header
    
    struct MyStandardLayout {
        char mem_a;
        int16_t num_1;
        int32_t num_2;
        int64_t num_3;
        char mem_z;
    
        MyStandardLayout()
        : mem_a('a')
        , num_1(1 + (1 << 14))
        , num_2(1 + (1 << 30))
        , num_3(1LL + (1LL << 62))
        , mem_z('z')
        { }
    
        void print() const {
            std::cout << 
                "MySL Obj: " <<
                mem_a << " / " <<
                num_1 << " / " <<
                num_2 << " / " <<
                num_3 << " / " <<
                mem_z << "\n";
        }
    };
    
    void ZeroInts(MyStandardLayout* pObj) {
        const size_t first = offsetof(MyStandardLayout, num_1);
        const size_t third = offsetof(MyStandardLayout, num_3);
        std::cout << "ofs(1st) =  " << first << "\n";
        std::cout << "ofs(3rd) =  " << third << "\n";
        assert(third > first);
        const size_t delta = third - first;
        std::cout << "delta =  " << delta << "\n";
        const size_t sizeAll = delta + sizeof(MyStandardLayout::num_3);
        std::cout << "sizeAll =  " << sizeAll << "\n";
    
        std::vector<char> buf( sizeAll, 0 );
        memcpy(&pObj->num_1, &buf[0], sizeAll);
    }
    
    int main()
    {
        MyStandardLayout obj;
        obj.print();
        ZeroInts(&obj);
        obj.print();
    
        return 0;
    }
    

    Given the wording in the C++ Standard:

    9.2 Class Members

    ...

    13 Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. (...) Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; (...)

    I would conclude that it is guaranteed that num_1 to num_3 have increasing addresses and are adjacent modulo padding.

    For the above example to be fully defined, I see the following requirements, and I am not sure that they hold:

    • memcpy must be allowed to write to multiple "memory objects" in this way at once, i.e. specifically

      • Calling memcpy with the target address of num_1 and a size that is larger than the size of the num_1 "object" is legal. (Given that num_1 is not part of an array.) ("Is memcpy(&a + 1, &b + 1, 0) defined in C11?" seems a good related question, but doesn't quite fit.)
      • The C++ (14) Standard, AFAICT, defers the description of memcpy to the C99 Standard, which states:

      7.21.2.1 The memcpy function

      2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.

      So for me the question here is whether the target range can be considered "an object" according to the C or C++ Standard. Note: a (part of an) array of chars, declared and defined as such, can certainly be assumed to count as "an object" for the purposes of memcpy, because I'm pretty sure I'm allowed to copy from one part of a char array to another part of (another) char array.

      So then the question would be whether it is legal to reinterpret the memory range of the three members as a "conceptual"(?) char array.

    • Calculating sizeAll is legal, that is, the usage of offsetof as shown is legal.

    • Writing to the padding in between the members is legal.

    Do these properties hold? Have I missed anything else?

    Answer

    Putting this as a partial answer regarding memcpy(&num_1, buf, sizeAll):

    Note: James' answer is much more concise and definitive.

    I asked:

    • memcpy must be allowed to write to multiple "memory objects" in this way at once, i.e. specifically

      • Calling memcpy with the target address of num_1 and a size that is larger than the size of the num_1 "object" is legal.
      • The C++ (14) Standard, AFAICT, defers the description of memcpy to the C99 Standard, which states:

      7.21.2.1 The memcpy function

      2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.

      So for me the question here is whether the target range can be considered "an object" according to the C or C++ Standard.

    Thinking and searching a bit more, I found in the C Standard:

    § 6.2.6 Representations of types

    § 6.2.6.1 General

    2 Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.

    So at least it is implied that "an object" => "contiguous sequence of bytes".

    I'm not so bold as to claim that the converse ("contiguous sequence of bytes" => "an object") holds, but at least "an object" doesn't seem to be defined more strictly here.

    Then, as quoted in the question, §9.2/13 of the C++ Standard (and §1.8/5) seem to guarantee that we do have a contiguous sequence of bytes (including padding).

    Then, §3.9/3 says:

    3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1. [ Example:

    T* t1p;
    T* t2p;       
         // provided that t2p points to an initialized object ...         
    std::memcpy(t1p, t2p, sizeof(T));  
         // at this point, every subobject of trivially copyable type in *t1p contains        
         // the same value as the corresponding subobject in *t2p
    

    —end example ]

    So this explicitly allows the application of memcpy to whole objects of Trivially Copyable types.

    In the example, the three members comprise a "trivially copyable sub-object", and indeed I think wrapping them in an actual subobject of distinct type would still mandate exactly the same memory layout for the explicit object as for the three members:

    struct MyStandardLayout_Flat {
        char mem_a;
        int16_t num_1;
        int32_t num_2;
        int64_t num_3;
        char mem_z;
    };
    
    struct MyStandardLayout_Sub {
        int16_t num_1;
        int32_t num_2;
        int64_t num_3;
    };
    
    struct MyStandardLayout_Composite {
        char mem_a;
        // Note that the padding here is different from the padding in MyStandardLayout_Flat, but that doesn't change how num_* are laid out.
        MyStandardLayout_Sub nums;
        char mem_z;
    };
    

    The memory layout of nums in _Composite and of the three members of _Flat should be exactly the same, because the same basic rules apply.

    So in conclusion, given that the "sub-object" num_1 to num_3 will be represented by a contiguous sequence of bytes equivalent to that of a full Trivially Copyable sub-object, I:

    • have a very, very hard time imagining an implementation or optimizer that breaks this
    • would say it can be either:
      • read as Undefined Behavior, iff we conclude that C++ §3.9/3 implies that only (full) objects of Trivially Copyable type are allowed to be treated this way by memcpy, or conclude from C99 §6.2.6.1/2 and the general specification of memcpy in 7.21.2.1 that the contiguous sequence of the num_* bytes does not comprise a "valid object" for the purposes of memcpy.
      • read as Defined Behavior, iff we conclude that C++ §3.9/3 does not normatively limit the applicability of memcpy to other types or memory ranges, and conclude that the definition of memcpy (and of the term "object") in the C99 Standard allows adjacent variables to be treated as a single contiguous-bytes target object.
