如何使用动态分配的任意维数组? [英] How can I work with dynamically-allocated arbitrary-dimensional arrays?

查看:27
本文介绍了如何使用动态分配的任意维数组?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

典型的一维数组可以在声明中静态或自动分配.

The typical 1-D array can be statically or automatically allocated in a declaration.

enum { n=100 };
int arr1[n];

或者通过指针动态分配和访问.

Or dynamically-allocated and accessed via a pointer.

int *arr1m=malloc(n*sizeof*arr1m);
int *arr1c=calloc(n, sizeof*arr1c);

这两种样式都使用相同的语法访问元素.

Both of these styles access an element with the same syntax.

int i = n/2;
arr1[i] = arr1c[i] = arr1m[i] = 42;

但是当您添加第二个维度时,需要付出一些努力才能实现相同的语法.

But when you add a second dimension, it takes some effort to achieve the same syntax.

int arr2[n][n];
int *arr2c=calloc(n*n,sizeof*arr2c);
arr2[5][5] = arr2c[5*n+5] = 23;

如果您将其构造为 Iliffe-vector,则您只能获得双括号.

You can only get the doubled set of brackets if instead you construct it as an Iliffe-vector.

int **arr2l=calloc(n,sizeof*arr2l);
for (int j=0; j<n; j++)
    arr2l[j]=calloc(n,sizeof**arr2l);
 arr2[6][6] = arr2l[6][6] = 72;

但是随着尺寸的增加,这变得越来越麻烦.

But this becomes more and more cumbersome as the dimensions increase.

另一个难点是在访问元素之前检查动态数组的边界(这样您就不会触及未正确分配的内存).real 数组可以使用 sizeof 运算符来确定边界,但这些动态数组中没有一个带有它们的大小.

Another difficulty is checking the bounds of a dynamic array before accessing an element (so that you're not touching memory that was not properly allocated). The real arrays can use the sizeof operator to determine the bounds, but none of these dynamic arrays carry their size with them.

我如何定义一个结构,它像数组一样具有快速、连续的布局,但具有一致的语法来访问具有索引列表的元素,这些索引列表对于 2D 数组和 3D 数组的工作方式相同;并且都是动态的,具有动态可用的大小,以便它们可以传递给函数并从函数返回?

How can I define a structure that has fast, contiguous layout like an array, but with a consistent syntax for accessing elements with a list of indices that works the same for a 2D array as for a 3D array; and all dynamically, with sizes available dynamically so they can be passed to and returned from functions?

推荐答案

在 J 语言(APL 的一种方言)的实现中使用了一种数据结构,它可以容纳动态分配的任意维数组.它混合使用 struct 和动态数组作为其数据结构,这种技巧通常被称为 struct hack.(关于 J 实现的更多信息这里这里.)

There is a data structure used in the implementation of the J language (a dialect of APL) which accommodates dynamically-allocated arbitrary-dimensional arrays. It uses a hybrid of a struct and a dynamic array for its data structure, a trick commonly known as the struct hack. (More info on the J implementation here and here.)

为了在一个简单的上下文中看到这个想法,考虑一维情况:我们想要一个动态的一维数组,它带有它的大小.所以:

To see the idea in a simple context, consider the 1D case: we want a dynamic 1D array which carries its size along with it. So:

struct vec { int n; int p[]; };

由于 p 成员是最后一个,而且 C 没有内置的边界检查,它可以用来访问 末尾的额外内存结构.当然,在分配时,我们需要提供这些额外的内存,而不是简单地分配struct 的大小.struct 只是数组的header.C90 需要一个数字(比如 1)来表示 p[] 数组的长度,但 C99 允许省略该数字,因此头的大小更易于计算.

Since the p member is last, and C has no built-in bounds checking, it can be used to access additional memory off the end of the struct. Of course, when allocating, we'll need to provide this extra memory and not simply allocate the size of the struct. The struct is just the header of the array. C90 requires a number (say, 1) for the length of the p[] array, but C99 allows the number to be omitted so the size of the header is more straightforward to calculate.

因此,具有更多维度的数组将需要更多值来保存每个维度的大小.为了让我们的结构适应不同维度的数组,这个维度向量也需要是可变长度的.

So, an array with more dimensions will need more values to hold the sizes of each dimension. And for our structure to accommodate arrays of different dimensionality, this dimension vector will need to be variable-length as well.

为了实现这一切,我们可以做的是两次应用 struct hack,递归地应用于自身.这给了我们一个这样的内存布局,其中 R 是我们称之为数组的 rank 的维数,Dvalues 是每个维度的长度,V 值是实际的数组数据:

What we can do to achieve all of this is to apply the struct hack twice, recursively upon itself. This gives us a memory layout like this, where R is the number of dimensions which we'll call the rank of the array, the D values are the lengths of each dimension, and the V values are the actual array data:

  1   R                    Product(D) 
 --- -------------------- ----------------------------- 
  R  D[0] D[1] ... D[R-1] V[0] V[1] ... V[Product(D)-1] 

并用 C 来描述这一点,

And to describe this in C,

typedef struct arr { int r; int d[]; } *arr;

数组a 的元素紧跟在dims 向量DR 元素之后.所以 V 元素可以在 a->d[r+0], a->d[r+1], ... a->d[r+i](在将索引向量减少到扁平表示上的单个索引之后).元素最容易按行优先顺序处理.实际元素的数量是所有维度的乘积,所有维度相乘.这里的表达式可能更好写:(a->d+a->r)[0], (a->d+a->r)[1], ... (a->d+a->r)[i].

The elements of an array a immediately follow the R elements of the dims vector D. So the V elements can be accessed at a->d[r+0], a->d[r+1], ... a->d[r+i] (after reducing the index vector down to a single index on the flattened representation). The elements are easiest to treat in row-major order. The number of actual elements is the product of all the dimensions, all the dimensions multiplied together. The expression here might be better written: (a->d+a->r)[0], (a->d+a->r)[1], ... (a->d+a->r)[i].

为了分配这些东西之一,我们需要一个函数来计算这个乘积作为大小计算的一部分.

In order to allocate one of these things, we'll need a function to compute this product as part of the size calculation.

int productdims(int rank, int *dims){
    int z=1;
    for(int i=0; i<rank; i++)
        z *= dims[i];
    return z;
}

为了初始化,我们只需要填写成员.

And to initialize, we just need to fill-in the members.

arr makearr(int rank, int *dims){
    arr z = calloc( (sizeof(struct arr)/sizeof(int)) + 
                rank + productdims(rank,dims), sizeof(int));
    z->r = rank;
    memmove(z->d,dims,rank*sizeof(int));
    return z;
}

记住使用单个索引(如问题中的典型动态数组)访问二维数据(例如 [m][n] 个元素的数组)的公式.元素 [i][j] 位于 i×n+j.对于 3D 数组 [m][n][o],元素 [i][j][k] 位于 i×(n×o)+j×o+k.

Remembering the formula for accessing 2D data (say an array of [m][n] elements) with a single index (it's a typical dynamic array like in the question). Element [i][j] is at i×n+j. With a 3D array [m][n][o], element [i][j][k] is at i×(n×o)+j×o+k.

因此,我们可以根据索引数组和维度数组为线性布局的数据计算单个索引.

So we can calculate a single index for our linearly-laid-out data from an array of indices and the array of dimensions.

int *elem(arr a, ...){
    va_list ap;
    int idx = 0;

    va_start(ap,a);
    if (a->r){
        idx = va_arg(ap,int);
        for(int i=1; i<a->r; i++){
            idx *= a->d[i];
            idx += va_arg(ap,int);
        }
    }
    va_end(ap);

    return &a->d[a->r + idx];
}

不要忘记标题:

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

瞧:

int main() {    

    {
        int i,n=6;
        arr m = makearr(1, (int[]){n});
        for (i=0;i<n;i++)
            *elem(m,i) = i;
        for (i=0;i<n;i++,printf(" "))
            printf("%d",*elem(m,i));
    }
    puts("\n");

    {
        int i,j,n=4;
        arr m = makearr(2, (int[]){n,n});
        for (i=0;i<n;i++)
            for (j=0;j<n;j++)
                *elem(m,i,j) = i*n+j;
        for (i=0;i<n;i++,printf("\n"))
            for (j=0;j<n;j++,printf(" "))
                printf("%d",*elem(m,i,j));
    }
    puts("\n");

    {
        int i,j,k,n=3;
        arr m = makearr(3, (int[]){n,n,n});
        for (i=0;i<n;i++)
            for (j=0;j<n;j++)
                for (k=0;k<n;k++)
                    *elem(m,i,j,k) = (i*n+j)*n+k;
        for (i=0;i<n;i++,printf("\n"))
            for (j=0;j<n;j++,printf("\n"))
                for (k=0;k<n;k++,printf(" "))
                    printf("%d",*elem(m,i,j,k));
    }

    return 0;
}

输出:

0 1 2 3 4 5 

0 1 2 3 
4 5 6 7 
8 9 10 11 
12 13 14 15 


0 1 2 
3 4 5 
6 7 8 

9 10 11 
12 13 14 
15 16 17 

18 19 20 
21 22 23 
24 25 26 

此代码的进一步详细说明以支持 2D 矩阵乘法和列切片已发布到 comp.lang.c这个话题.

Further elaboration of this code to support 2D matrix multiplication and column slices has been posted to comp.lang.c in this thread.

更有动力问题/更强大的数据结构.

这篇关于如何使用动态分配的任意维数组?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆