Can I use memcpy to write to multiple adjacent Standard Layout sub-objects?
Problem description
Disclaimer: This is trying to drill down on a larger problem, so please don't get hung up on whether the example makes any sense in practice.
And, yes, if you want to copy objects, please use / provide the copy-constructor. (But note how even the example does not copy a whole object; it tries to blit some memory over a few adjacent(Q.2) integers.)
Given a C++ Standard Layout struct, can I use memcpy to write to multiple (adjacent) sub-objects at once?

Complete example: (https://ideone.com/1lP2Gd https://ideone.com/YXspBk)

```cpp
#include <vector>
#include <iostream>
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <string.h>  // memcpy (the original used the non-standard <memory.h>)

struct MyStandardLayout {
    char mem_a;
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
    char mem_z;

    MyStandardLayout()
    : mem_a('a')
    , num_1(1 + (1 << 14))
    , num_2(1 + (1 << 30))
    , num_3(1LL + (1LL << 62))
    , mem_z('z')
    { }

    void print() const {
        std::cout <<
            "MySL Obj: " <<
            mem_a << " / " <<
            num_1 << " / " <<
            num_2 << " / " <<
            num_3 << " / " <<
            mem_z << "\n";
    }
};

void ZeroInts(MyStandardLayout* pObj) {
    const size_t first = offsetof(MyStandardLayout, num_1);
    const size_t third = offsetof(MyStandardLayout, num_3);
    std::cout << "ofs(1st) = " << first << "\n";
    std::cout << "ofs(3rd) = " << third << "\n";
    assert(third > first);
    const size_t delta = third - first;
    std::cout << "delta = " << delta << "\n";
    // Bytes from the start of num_1 to the end of num_3, padding included.
    const size_t sizeAll = delta + sizeof(MyStandardLayout::num_3);
    std::cout << "sizeAll = " << sizeAll << "\n";

    std::vector<char> buf( sizeAll, 0 );
    // The call in question: one memcpy spanning three adjacent members.
    memcpy(&pObj->num_1, &buf[0], sizeAll);
}

int main()
{
    MyStandardLayout obj;
    obj.print();
    ZeroInts(&obj);
    obj.print();
    return 0;
}
```
Given the wording in the C++ Standard:
9.2 Class Members
...
13 Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. (...) Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; (...)
I would conclude that it is guaranteed that num_1 to num_3 have increasing addresses and are adjacent modulo padding.
For the above example to be fully defined, I see these requirements, of which I am not sure they hold:
- memcpy must be allowed to write to multiple "memory objects" in this way at once, i.e. specifically
  - Calling memcpy with the target address of num_1 and a size that is larger than the size of the num_1 "object" is legal. (Given that num_1 is not part of an array.) ("Is memcpy(&a + 1, &b + 1, 0) defined in C11?" seems a good related question, but doesn't quite fit.)
  - The C++ (14) Standard, AFAICT, refers description of memcpy to the C99 Standard, and that one states:
    7.21.2.1 The memcpy function
    2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.
    So for me the question here wrt. this is whether the target range we have here can be considered "an object" according to the C or C++ Standard. Note: A (part of an) array of chars, declared and defined as such, certainly can be assumed to count as "an object" for the purposes of memcpy because I'm pretty sure I'm allowed to copy from one part of a char array to another part of (another) char array. So then the question would be if it is legal to reinterpret the memory range of the three members as a "conceptual"(?) char array.
- Calculating sizeAll is legal, that is usage of offsetof is legal as shown.
- Writing to the padding in between the members is legal.
Do these properties hold? Have I missed anything else?
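The layout assumptions behind these requirements can at least be checked at compile time. The following is only a sketch: it uses a trimmed copy of MyStandardLayout (data members only), and the helper names span_bytes and padding_bytes are made up for illustration. It verifies ordering and computes the span size; it says nothing about whether the single memcpy over that span is defined behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Trimmed copy of the struct from the question: data members only,
// since the constructor and print() do not affect the layout.
struct MyStandardLayout {
    char    mem_a;
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
    char    mem_z;
};

// offsetof is only unconditionally supported for standard-layout classes.
static_assert(std::is_standard_layout<MyStandardLayout>::value,
              "offsetof requires a standard-layout class");

// Members with the same access control get increasing addresses.
static_assert(offsetof(MyStandardLayout, num_1) < offsetof(MyStandardLayout, num_2),
              "num_1 must precede num_2");
static_assert(offsetof(MyStandardLayout, num_2) < offsetof(MyStandardLayout, num_3),
              "num_2 must precede num_3");

// Hypothetical helper: bytes from the start of num_1 to the end of num_3,
// including padding (the same formula as sizeAll in the question).
constexpr std::size_t span_bytes() {
    return offsetof(MyStandardLayout, num_3) - offsetof(MyStandardLayout, num_1)
           + sizeof(int64_t);
}

// Hypothetical helper: padding bytes inside that span, i.e. the span
// minus the sizes of the three members themselves.
constexpr std::size_t padding_bytes() {
    return span_bytes() - (sizeof(int16_t) + sizeof(int32_t) + sizeof(int64_t));
}
```

The actual values of span_bytes() and padding_bytes() depend on the implementation's alignment rules, so they are deliberately not hard-coded here.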
Solution
Putting this as a partial answer wrt. memcpy(&num_1, buf, sizeAll):
Note: James' answer is much more concise and definitive.
I asked:
- memcpy must be allowed to write to multiple "memory objects" in this way at once, i.e. specifically
  - Calling memcpy with the target address of num_1 and a size that is larger than the size of the num_1 "object" is legal.
  - The C++ (14) Standard, AFAICT, refers description of memcpy to the C99 Standard, and that one states:
    7.21.2.1 The memcpy function
    2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.
    So for me the question here wrt. this is whether the target range we have here can be considered "an object" according to the C or C++ Standard.
Thinking and searching a bit more, I found in the C Standard:
§ 6.2.6 Representations of types
§ 6.2.6.1 General
2 Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.
So at least it is implied that "an object" => "contiguous sequence of bytes".
I'm not so bold as to claim that the converse -- "contiguous sequence of bytes" => "an object" -- holds, but at least "an object" doesn't seem to be defined more strictly here.
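The uncontested direction, a single object being exactly sizeof(T) contiguous bytes, can be sketched as follows. The helper name round_trip is made up for illustration; the sketch copies the bytes of one int32_t into an unsigned char buffer and back, which stays within a single object on each side of every memcpy call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the bytes of one int32_t into a char buffer
// and from there into a second int32_t, then returns that second value.
inline int32_t round_trip(int32_t value) {
    unsigned char bytes[sizeof(int32_t)];
    std::memcpy(bytes, &value, sizeof value);  // object -> bytes
    int32_t copy = 0;
    std::memcpy(&copy, bytes, sizeof copy);    // bytes -> object
    // Both objects now have the same object representation.
    assert(std::memcmp(&copy, &value, sizeof value) == 0);
    return copy;
}
```

This covers one object at a time; it does not by itself show that several adjacent members may be treated as one such object.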
Then, as quoted in Q, §9.2/13 of the C++ Standard (and § 1.8/5) seem to guarantee that we do have a contiguous sequence of bytes (including padding).
Then, §3.9/3 says:
3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1. [ Example:
```cpp
T* t1p;
T* t2p;
// provided that t2p points to an initialized object ...
std::memcpy(t1p, t2p, sizeof(T));
// at this point, every subobject of trivially copyable type in *t1p contains
// the same value as the corresponding subobject in *t2p
```
—end example ]
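As a concrete instance of the Standard's example, here is a minimal sketch with a small trivially copyable struct. The names Ints and copy_whole are assumptions introduced for illustration and do not appear in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical trivially copyable type, used only for illustration.
struct Ints {
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
};

static_assert(std::is_trivially_copyable<Ints>::value,
              "whole-object memcpy relies on trivial copyability");

// Copies all of src into dst with one call, as in the Standard's example:
// source and destination are each one complete object of type Ints.
inline void copy_whole(Ints& dst, const Ints& src) {
    assert(&dst != &src);  // memcpy requires non-overlapping ranges
    std::memcpy(&dst, &src, sizeof(Ints));
}
```

Here the destination range is exactly one object of type Ints, which is the case §3.9/3 addresses directly.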
So this explicitly allows the application of memcpy to whole objects of Trivially Copyable types.
In the example, the three members comprise a "trivially copyable sub-object", and indeed I think wrapping them in an actual subobject of distinct type would still mandate exactly the same memory layout for the explicit object as for the three members:

```cpp
struct MyStandardLayout_Flat {
    char mem_a;
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
    char mem_z;
};

struct MyStandardLayout_Sub {
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
};

struct MyStandardLayout_Composite {
    char mem_a;
    // Note that the padding here is different from the padding in
    // MyStandardLayout_Flat, but that doesn't change how num_* are laid out.
    MyStandardLayout_Sub nums;
    char mem_z;
};
```
The memory layout of nums in _Composite and that of the three members of _Flat should be exactly the same, because the same basic rules apply.
So in conclusion, given that the "sub-object" num_1 to num_3 will be represented by a contiguous sequence of bytes equivalent to that of a full Trivially Copyable sub-object, I:
- have a very, very hard time imagining an implementation or optimizer that breaks this
- would say it can either be:
  - read as Undefined Behavior, iff we conclude that C++ §3.9/3 implies that only (full) objects of Trivially Copyable type are allowed to be treated thusly by memcpy, or conclude from C99 §6.2.6.1/2 and the general spec of memcpy (7.21.2.1) that the contiguous sequence of the num_* bytes does not comprise a "valid object" for the purposes of memcpy.
  - read as Defined Behavior, iff we conclude that C++ §3.9/3 does not normatively limit the applicability of memcpy to other types or memory ranges, and conclude that the definition of memcpy (and the term "object") in the C99 Standard allows treating adjacent variables as a single target object of contiguous bytes.
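Whether the flat members and the wrapped sub-object really share a layout is up to the implementation, so it seems worth checking instead of assuming. The sketch below compares relative member offsets; the helper names flat_gap, sub_gap and same_relative_layout are made up for illustration. On common ABIs where int16_t is 2-byte aligned and int32_t is 4-byte aligned, num_1 in the flat struct can be placed directly after mem_a's padding, so the distance from num_1 to num_2 may be smaller there than in the sub-struct.

```cpp
#include <cstddef>
#include <cstdint>

struct MyStandardLayout_Flat {
    char mem_a;
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
    char mem_z;
};

struct MyStandardLayout_Sub {
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
};

// Hypothetical helper: distance from num_1 to num_2 in the flat struct.
constexpr std::size_t flat_gap() {
    return offsetof(MyStandardLayout_Flat, num_2)
         - offsetof(MyStandardLayout_Flat, num_1);
}

// Hypothetical helper: the same distance inside the sub-struct.
constexpr std::size_t sub_gap() {
    return offsetof(MyStandardLayout_Sub, num_2)
         - offsetof(MyStandardLayout_Sub, num_1);
}

// True only if all three members sit at the same offsets relative to num_1
// in both structs; the result is implementation-specific.
constexpr bool same_relative_layout() {
    return flat_gap() == sub_gap()
        && (offsetof(MyStandardLayout_Flat, num_3)
            - offsetof(MyStandardLayout_Flat, num_1))
           == (offsetof(MyStandardLayout_Sub, num_3)
               - offsetof(MyStandardLayout_Sub, num_1));
}
```

If same_relative_layout() evaluates to false on a given platform, the equivalence argument above does not carry over to that platform as stated.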