MPI_reduce() with custom datatype containing dynamically allocated arrays: segmentation fault


Problem description

I don't get why MPI_Reduce() segfaults as soon as I use a custom MPI datatype which contains dynamically allocated arrays. Does anyone know? The following code crashes with 2 processors, inside MPI_Reduce(). However, if I remove the member double *d from MyType and change the operator and the MPI type routine accordingly, the reduction is done without any problem.

Is there a problem with using dynamically allocated arrays, or is there something fundamentally wrong with what I'm doing:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

typedef struct mytype_s
{
    int c[2];
    double a;
    double b;
    double *d;
} MyType;

void CreateMyTypeMPI(MyType *mt, MPI_Datatype *MyTypeMPI)
{
    int block_lengths[4];                        // # of elt. in each block
    MPI_Aint displacements[4];                   // displac.
    MPI_Datatype typelist[4];                    // list of types
    MPI_Aint start_address, address;            // use for calculating displac.
    MPI_Datatype myType;

    block_lengths[0] = 2;
    block_lengths[1] = 1;
    block_lengths[2] = 1;
    block_lengths[3] = 10;

    typelist[0] = MPI_INT;
    typelist[1] = MPI_DOUBLE;
    typelist[2] = MPI_DOUBLE;
    typelist[3] = MPI_DOUBLE;

    displacements[0] = 0;

    MPI_Address(&mt->c, &start_address);
    MPI_Address(&mt->a, &address);
    displacements[1] = address - start_address;

    MPI_Address(&mt->b,&address);
    displacements[2] = address-start_address;

    MPI_Address(&mt->d, &address);
    displacements[3] = address-start_address;

    MPI_Type_struct(4,block_lengths, displacements,typelist,MyTypeMPI);
    MPI_Type_commit(MyTypeMPI);
}

void MyTypeOp(MyType *in, MyType *out, int *len, MPI_Datatype *typeptr)
{
    int i;
    int j;

    for (i=0; i < *len; i++)
    {
        out[i].a += in[i].a;
        out[i].b += in[i].b;
        out[i].c[0] += in[i].c[0];
        out[i].c[1] += in[i].c[1];

        for (j=0; j<10; j++)
        {
            out[i].d[j] += in[i].d[j];
        }
    }
}

int main(int argc, char **argv)
{
    MyType mt;
    MyType mt2;
    MPI_Datatype MyTypeMPI;
    MPI_Op MyOp;
    int rank;
    int i;

    MPI_Init(&argc,&argv);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);


    mt.a = 2;
    mt.b = 4;
    mt.c[0] = 6;
    mt.c[1] = 8;
    mt.d = calloc(10,sizeof *mt.d);
    for (i=0; i<10; i++) mt.d[i] = 2.1;

    mt2.a = 0;
    mt2.b = 0;
    mt2.c[0] = mt2.c[1] = 0;
    mt2.d = calloc(10,sizeof *mt2.d);


    CreateMyTypeMPI(&mt, &MyTypeMPI);
    MPI_Op_create((MPI_User_function *) MyTypeOp,1,&MyOp);

    if(rank==0) printf("type and operator are created now\n");

    MPI_Reduce(&mt,&mt2,1,MyTypeMPI,MyOp,0,MPI_COMM_WORLD);

    if(rank==0)
    {
        for (i=0; i<10; i++) printf("%f ",mt2.d[i]);
        printf("\n");
    }

    free(mt.d);
    free(mt2.d);
    MPI_Finalize();

    return 0;
}

Solution

Let's look at your struct:

typedef struct mytype_s
{
    int c[2];
    double a;
    double b;
    double *d;
} MyType;

...

MyType mt;
mt.d = calloc(10,sizeof *mt.d);

And your description of this struct as an MPI type:

displacements[0] = 0;

MPI_Address(&mt->c, &start_address);
MPI_Address(&mt->a, &address);
displacements[1] = address - start_address;

MPI_Address(&mt->b,&address);
displacements[2] = address-start_address;

MPI_Address(&mt->d, &address);
displacements[3] = address-start_address;

MPI_Type_struct(4,block_lengths, displacements,typelist,MyTypeMPI);

The problem is, this MPI struct is only ever going to apply to the one instance of the structure you've used in the definition here. You have no control at all over where calloc() decides to grab memory from; it could be anywhere in virtual memory. The next one of these types you create and instantiate will have a completely different displacement for its d array; and even with the same struct, if you change the size of the array with realloc() on the current mt, it could end up having a different displacement.
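To make that concrete, here is a minimal sketch (my illustration, not the poster's code; the instance names x and y are made up) that prints the offset of each instance's heap block relative to the start of that instance, using MPI_Get_address, the non-deprecated successor of the MPI_Address call used above. The two offsets will almost certainly differ, so a datatype built from one instance describes garbage for the other:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

typedef struct { int c[2]; double a, b; double *d; } MyType;

int main(int argc, char **argv)
{
    MyType x, y;
    MPI_Aint base, addr;

    MPI_Init(&argc, &argv);

    x.d = calloc(10, sizeof *x.d);    /* two independent heap allocations */
    y.d = calloc(10, sizeof *y.d);

    /* offset of x's heap array relative to the start of x */
    MPI_Get_address(&x, &base);
    MPI_Get_address(x.d, &addr);
    printf("x: d lives at offset %ld\n", (long)(addr - base));

    /* the same computation for y yields an unrelated offset */
    MPI_Get_address(&y, &base);
    MPI_Get_address(y.d, &addr);
    printf("y: d lives at offset %ld\n", (long)(addr - base));

    free(x.d);
    free(y.d);
    MPI_Finalize();
    return 0;
}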

So when you send, receive, reduce, or anything else with one of these types, the MPI library will dutifully go to a probably meaningless displacement, and try to read or write from there, and that'll likely cause a segfault.

Note that this isn't an MPI thing; in using any low-level communications library, or for that matter trying to write out/read in from disk, you'd have the same problem.

Your options include manually "marshalling" the array into a message, either together with the other fields or separately; or adding some predictability to where d is located, for instance by defining it to be an array of some fixed maximum size, as sketched below.
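A rough sketch of that second option (my illustration, not the poster's code; MyTypeFixed, D_MAX and CreateFixedTypeMPI are made-up names): with d embedded in the struct, every instance has the same layout, so one committed datatype built from compile-time offsetof() displacements works for all of them. MPI_Type_create_struct is the standard replacement for the deprecated MPI_Type_struct:

#include <stddef.h>    /* offsetof */
#include <mpi.h>

#define D_MAX 10

typedef struct
{
    int    c[2];
    double a;
    double b;
    double d[D_MAX];   /* embedded array: same offset in every instance */
} MyTypeFixed;

/* Build a reusable MPI datatype from compile-time offsets. */
void CreateFixedTypeMPI(MPI_Datatype *newtype)
{
    int          lens[4]  = { 2, 1, 1, D_MAX };
    MPI_Aint     disps[4] = { offsetof(MyTypeFixed, c),
                              offsetof(MyTypeFixed, a),
                              offsetof(MyTypeFixed, b),
                              offsetof(MyTypeFixed, d) };
    MPI_Datatype types[4] = { MPI_INT, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE };

    MPI_Type_create_struct(4, lens, disps, types, newtype);
    MPI_Type_commit(newtype);
}

With that layout the original MPI_Reduce() call and the user operator work unchanged (with the loop bound replaced by D_MAX). The marshalling alternative would instead copy c, a, b and the ten doubles into a plain contiguous buffer, reduce that buffer, and unpack it on the root.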
