Struct of arrays, arrays of structs and memory usage pattern


Question

I've been reading about SoA (struct of arrays) and I want to try to implement it in a system that I am building.

I am writing some simple C structs to run some tests, but I am a bit confused. Right now I have 3 different structs for a vec3. I will show them below and then go into further detail about the question.

#include <stddef.h> /* size_t */
#include <stdlib.h> /* malloc */

struct vec3
{
    size_t x, y, z;
};

struct vec3_a
{
    size_t pos[3];
};

struct vec3_b
{
    size_t* x;
    size_t* y;
    size_t* z;
};

struct vec3 vec3(size_t x, size_t y, size_t z)
{
    struct vec3 v;
    v.x = x;
    v.y = y;
    v.z = z;
    return v;
}

struct vec3_a vec3_a(size_t x, size_t y, size_t z)
{
    struct vec3_a v;
    v.pos[0] = x;
    v.pos[1] = y;
    v.pos[2] = z;
    return v;
}

struct vec3_b vec3_b(size_t x, size_t y, size_t z)
{
    struct vec3_b v;
    v.x = (size_t*)malloc(sizeof(size_t));
    v.y = (size_t*)malloc(sizeof(size_t));
    v.z = (size_t*)malloc(sizeof(size_t));
    *(v.x) = x;
    *(v.y) = y;
    *(v.z) = z;
    return v;
}

Those are the definitions of the three vec3 types. I create one instance of each:

struct vec3 v = vec3(10, 20, 30);
struct vec3_a va = vec3_a(10, 20, 30);
struct vec3_b vb = vec3_b(10, 20, 30);

Printing the sizes, values and addresses with printf, I get output like this:

size of vec3      : 24 bytes
size of vec3a     : 24 bytes
size of vec3b     : 24 bytes
size of size_t    : 8 bytes
size of int       : 4 bytes
size of 16 int    : 64 bytes
vec3 x:10, y:20, z:30
vec3 x:0x7fff57f8e788, y:0x7fff57f8e790, z:0x7fff57f8e798
vec3a x:10, y:20, z:30
vec3a x:0x7fff57f8e768, y:0x7fff57f8e770, z:0x7fff57f8e778
vec3b x:10, y:20, z:30
vec3b x:0x7fbe514026a0, y:0x7fbe51402678, z:0x7fbe51402690

One final thing I did was create an array of 10 struct vec3_b and print out the addresses, which gave these values:

    struct vec3_b vb3[10];
    for(int i = 0; i < 10; i++)
    {
        vb3[i] = vec3_b(i, i*2, i*4);
    }

index:0 vec3b x:0x7fbe514031f0, y:0x7fbe51403208, z:0x7fbe51403420
index:1 vec3b x:0x7fbe51403420, y:0x7fbe51403438, z:0x7fbe51403590
index:2 vec3b x:0x7fbe51403590, y:0x7fbe514035a8, z:0x7fbe514035c0
index:3 vec3b x:0x7fbe514035c0, y:0x7fbe514035d8, z:0x7fbe514035f0
index:4 vec3b x:0x7fbe514035f0, y:0x7fbe51403608, z:0x7fbe51403680
index:5 vec3b x:0x7fbe51403680, y:0x7fbe51403698, z:0x7fbe514036b0
index:6 vec3b x:0x7fbe514036b0, y:0x7fbe514036c8, z:0x7fbe514036e0
index:7 vec3b x:0x7fbe514036e0, y:0x7fbe514036f8, z:0x7fbe51403710
index:8 vec3b x:0x7fbe51403710, y:0x7fbe51403728, z:0x7fbe51403740
index:9 vec3b x:0x7fbe51403740, y:0x7fbe51403758, z:0x7fbe51403770

Questions:

  1. Is my implementation of struct vec3_b the proper way to set up a struct of arrays?

  2. Since the vec3_b structure is 24 bytes large, could I fit 2 of them, plus 16 additional bytes, in one 64-byte cache line of a modern CPU?

  3. If my vec3_b is the proper way to do an SoA setup, I am having some trouble with the addressing when I put 10 vec3_b together.

Looking at the hex values and their decimal representations, I cannot see any pattern, which leads me to believe that my setup is incorrect.

      ---------------x-----------------|----------------y-----------------|----------------z-----------------|

0|    0x7fbe514031f0 : 140455383675376 | 0x7fbe51403208 : 140455383675400 | 0x7fbe51403420 : 140455383675936
1|    0x7fbe51403420 : 140455383675936 | 0x7fbe51403438 : 140455383675960 | 0x7fbe51403590 : 140455383676304
2|    0x7fbe51403590 : 140455383676304 | 0x7fbe514035a8 : 140455383676328 | 0x7fbe514035c0 : 140455383676352

Answer

  1. I can't think of an occasion when vec3_b would be a good idea.

  2. Note that you also have to find space for the 24 bytes of data that the pointers point at, and it probably won't be contiguous with the structure itself, so you have probably reduced your effective cache capacity by a factor of 2 compared to vec3 or vec3_a. Each malloc() also has a minimum allocation size; on a 64-bit machine, that is usually at least 16 bytes. So the three separate allocations for the three pointed-at values in a vec3_b structure need at least 48 bytes of supporting data, plus the 24 bytes for the structure itself: 72 bytes in total. That doesn't fit in a single typical 64-byte cache line, and it's not guaranteed to be placed so that it fits into 2 cache lines.

  3. N/A: the question is predicated on a false assumption.
