Multidimensional array indexing using pointer to elements


Problem description


As far as I know, a multidimensional array on the stack occupies contiguous memory in row-major order. According to the ISO C++ Standard, is it undefined behavior to index a multidimensional array using a pointer to its elements? For example:

#include <iostream>
#include <type_traits>
int main() {
  int a[5][4]{{1,2,3,4},{},{5,6,7,8}};
  constexpr auto sz = sizeof(a) / sizeof(std::remove_all_extents<decltype(a)>::type);
  int *p = &a[0][0];
  int i = p[11];  // <-- here
  p[19] = 20;  // <-- here
  for (int k = 0; k < sz; ++k)
    std::cout << p[k] << ' ';  // <-- and here
  return 0;
}

The code above compiles and runs correctly as long as the pointer does not go past the bounds of the array a. But does this happen because of compiler-defined behavior or because the language standard guarantees it? Any reference from the ISO C++ Standard would be best.

Solution

The problem here is the strict aliasing rule, which in my draft n3337 for C++11 is found in 3.10 Lvalues and rvalues [basic.lval] § 10. It is an exhaustive list, and it does not explicitly allow aliasing a multidimensional array as a one-dimensional array of the whole size.

It is indeed required that arrays are allocated contiguously in memory, which means that the size of a multidimensional array, say T arr[n][m], is the product of its dimensions and the size of an element: n * m * sizeof(T). When converted to char pointers, you can even do pointer arithmetic over the whole array, because any pointer to an object can be converted to a char pointer, and that char pointer can be used to access the consecutive bytes of the object (*).
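The size relation can be checked at compile time. The following is only a minimal sketch: the helper name element_count is hypothetical, and it reuses the sizeof / std::remove_all_extents idiom from the question.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: total number of scalar elements in a (possibly
// multidimensional) built-in array type, computed as in the question.
template <typename Array>
constexpr std::size_t element_count() {
    return sizeof(Array) / sizeof(typename std::remove_all_extents<Array>::type);
}

// Array elements are laid out without padding between them, so the size of
// int[5][4] is exactly 5 * 4 * sizeof(int).
static_assert(sizeof(int[5][4]) == 5 * 4 * sizeof(int), "n * m * sizeof(T)");
static_assert(sizeof(int[4]) == 4 * sizeof(int), "one row is an int[4]");
static_assert(element_count<int[5][4]>() == 20, "5 rows of 4 elements");
```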

But unfortunately, for any other type, the standard only allows pointer arithmetic inside one array (and by definition, accessing an array element is the same as dereferencing a pointer after pointer arithmetic: a[i] is *(a + i)). So if you respect both the rule on pointer arithmetic and the strict aliasing rule, the global indexing of a multidimensional array is not defined by the C++11 standard, unless you go through char pointer arithmetic:

int a[3][4];
int *p = &a[0][0]; // perfectly defined
int b = p[3];      // OK: still in the first row, i.e. in the same array
b = p[5];          // OOPS: dereferences past the end of the int[4] array that forms the first row

char *cq = ((char *) p) + 5 * sizeof(int); // OK: char pointer arithmetic inside an object
int *q = (int *) cq; // OK because what lies there is an int object
b = *q;            // almost the same as p[5], but the behaviour is defined

That char pointer arithmetic, along with the fear of breaking a lot of existing code, explains why all well-known compilers silently accept aliasing a multidimensional array as a 1D array of the same total size (it leads to the same generated code). Technically, however, the global pointer arithmetic is only valid for char pointers.
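If you would rather not rely on what compilers tolerate, one option is to map a flat index onto a row and a column, so that every access stays inside a single row array. The following is only a sketch under assumptions: the names kRows, kCols, flat_at and sum_all are hypothetical and the dimensions are fixed for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kRows = 3;  // dimensions chosen for illustration
constexpr std::size_t kCols = 4;

// Hypothetical helper: map a flat index k in [0, kRows * kCols) onto a row
// and a column, so each access stays inside a single int[kCols] row.
inline int& flat_at(int (&a)[kRows][kCols], std::size_t k) {
    return a[k / kCols][k % kCols];
}

// Sum every element using only in-row indexing.
inline int sum_all(int (&a)[kRows][kCols]) {
    int total = 0;
    for (std::size_t k = 0; k < kRows * kCols; ++k) {
        total += flat_at(a, k);
    }
    return total;
}
```

With this mapping, flat_at(a, 5) refers to a[1][1], the element that p[5] was meant to reach in the snippet above.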


(*) The standard declares in 1.7 The C++ memory model [intro.memory] that

The fundamental storage unit in the C++ memory model is the byte... The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.

and later in 3.9 Types [basic.types] §2

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes making up the object can be copied into an array of char or unsigned char.

and to copy them, you must access them through a char * or an unsigned char *.
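The byte-copy guarantee quoted above also suggests another way to get a flat view: copy the bytes of the whole array into a genuinely one-dimensional array and index that instead. A minimal sketch, assuming a 3x4 int array; the helper name flatten is hypothetical, and std::memcpy performs the byte-wise copy.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy the bytes of a 3x4 int array into a flat
// std::array, which can then be indexed from 0 to 11 without leaving it.
inline std::array<int, 12> flatten(const int (&a)[3][4]) {
    static_assert(sizeof(a) == 12 * sizeof(int), "source and destination sizes must match");
    std::array<int, 12> flat{};
    // int is trivially copyable, so its underlying bytes may be copied.
    std::memcpy(flat.data(), a, sizeof(a));
    return flat;
}
```

The trade-off is that the result is a copy: writes to it do not reach the original array.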
