Segmentation fault when sending struct having std::vector member

Question

Why do I get the following error when I run the code below with mpirun -np 2 ./out? I call make_layout() after resizing the std::vector, so I would not expect this error. It works if I do not resize. What is the reason?

main.cpp:

#include <iostream>
#include <vector>
#include "mpi.h"

MPI_Datatype MPI_CHILD;

struct Child
{
    std::vector<int> age;

    void make_layout();
};

void Child::make_layout()
{
    int nblock = 1;
    int age_size = age.size();
    int block_count[nblock] = {age_size};
    MPI_Datatype block_type[nblock] = {MPI_INT};
    MPI_Aint offset[nblock] = {0};
    MPI_Type_struct(nblock, block_count, offset, block_type, &MPI_CHILD);
    MPI_Type_commit(&MPI_CHILD);
}

int main()
{
    int rank, size;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);    

    Child kid;
    kid.age.resize(5);
    kid.make_layout();
    int datasize;
    MPI_Type_size(MPI_CHILD, &datasize);
    std::cout << datasize << std::endl; // output: 20 (5x4 seems OK).

    if (rank == 0)
    {
        MPI_Send(&kid, 1, MPI_CHILD, 1, 0, MPI_COMM_WORLD);
    }

    if (rank == 1)
    {
        MPI_Recv(&kid, 1, MPI_CHILD, 0, 0, MPI_COMM_WORLD, NULL);
    }

    MPI_Finalize();

    return 0;
}

Error message:

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x14ae7b8
[ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x113d0)[0x7fe1ad91c3d0]
[ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x22)[0x7fe1ad5c5a92]
[ 2] ./out[0x400de4]
[ 3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fe1ad562830]
[ 4] ./out[0x400ec9]
*** End of error message ***

Recommended answer

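The original code passes &kid with an offset of 0, so MPI reads and writes the bytes of the std::vector object itself (its internal pointers) rather than the heap buffer that holds the elements. On the receiving rank this most likely corrupts the vector, whose destructor then crashes in free(), consistent with the cfree frame in the backtrace. Without the resize() the block count is 0, so nothing is transferred and nothing is overwritten.

Here is an example with several std::vector members that uses MPI datatypes with absolute addresses: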

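#include <vector>
#include <mpi.h>

struct Child
{
    int foo;
    std::vector<float> bar;
    std::vector<int> baz;

    Child() : dtype(MPI_DATATYPE_NULL) {}
    // Note: this destructor must run before MPI_Finalize() is called.
    ~Child() { if (dtype != MPI_DATATYPE_NULL) MPI_Type_free(&dtype); }

    MPI_Datatype mpi_dtype();
    void invalidate_dtype();

private:
    MPI_Datatype dtype;
    void make_dtype();
};

// Returns the datatype for this instance, building it on first use.
MPI_Datatype Child::mpi_dtype()
{
    if (dtype == MPI_DATATYPE_NULL)
        make_dtype();
    return dtype;
}

// Frees the cached datatype; MPI_Type_free resets the handle to MPI_DATATYPE_NULL.
void Child::invalidate_dtype()
{
    if (dtype != MPI_DATATYPE_NULL)
        MPI_Type_free(&dtype);
}

void Child::make_dtype()
{
    const int nblock = 3;
    // size() returns size_t; brace initialization needs an explicit conversion to int.
    int block_count[nblock] = {1, static_cast<int>(bar.size()), static_cast<int>(baz.size())};
    MPI_Datatype block_type[nblock] = {MPI_INT, MPI_FLOAT, MPI_INT};
    MPI_Aint offset[nblock];
    // Absolute addresses of the data, not offsets relative to the struct.
    MPI_Get_address(&foo, &offset[0]);
    MPI_Get_address(bar.data(), &offset[1]);
    MPI_Get_address(baz.data(), &offset[2]);

    // MPI_Type_struct was removed in MPI-3.0; MPI_Type_create_struct replaces it.
    MPI_Type_create_struct(nblock, block_count, offset, block_type, &dtype);
    MPI_Type_commit(&dtype);
}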

Example usage of that class:

Child kid;
kid.foo = 5;
kid.bar.resize(5);
kid.baz.resize(10);

if (rank == 0)
{
    MPI_Send(MPI_BOTTOM, 1, kid.mpi_dtype(), 1, 0, MPI_COMM_WORLD);
}

if (rank == 1)
{
    MPI_Recv(MPI_BOTTOM, 1, kid.mpi_dtype(), 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

Note the use of MPI_BOTTOM as the buffer address. MPI_BOTTOM denotes the bottom of the address space, which is 0 on architectures with a flat address space. Because the offsets passed to MPI_Type_create_struct are the absolute addresses of the structure members, adding them to 0 again yields the absolute address of each member. Child::mpi_dtype() returns a lazily constructed MPI datatype specific to that instance.
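
To see why absolute addresses are needed instead of offsets from the start of the struct, the following sketch checks where a vector's elements actually live. It uses no MPI at all. Kid, inside_object and elements_live_outside are hypothetical names introduced only for this illustration, and the check relies on std::vector keeping its elements in a separate heap allocation, as mainstream implementations do.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the struct in the question.
struct Kid
{
    std::vector<int> age;
};

// Returns true if address p lies within the bytes occupied by object k.
bool inside_object(const Kid& k, const void* p)
{
    const auto begin = reinterpret_cast<std::uintptr_t>(&k);
    const auto end   = begin + sizeof(Kid);
    const auto addr  = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < end;
}

// Returns true if the vector's elements are stored outside the struct,
// meaning that an offset of 0 from &k cannot reach the element data.
bool elements_live_outside(const Kid& k)
{
    assert(!k.age.empty());  // data() is only meaningful for a non-empty vector
    return !inside_object(k, k.age.data());
}
```

For a Kid whose age vector has been resized to 5, elements_live_outside should return true, while inside_object(kid, &kid.age) returns true because the vector object itself (its bookkeeping pointers) is what sits inside the struct. That is what a datatype with offset 0 relative to &kid would transfer.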

Since resize() may reallocate memory, which can move the data to a different location (and it changes the element counts stored in the datatype as well), call invalidate_dtype() to force the MPI datatype to be recreated after resize() or any other operation that might trigger reallocation:

// ...
kid.bar.resize(100);
kid.invalidate_dtype();
// MPI_Send / MPI_Recv
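
The effect of reallocation on a cached address can be observed without MPI. The sketch below is only an illustration under the assumption that the default allocator is used; storage_moves_on_growth and storage_stays_on_shrink are hypothetical helper names, not part of the answer's class.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Grows v past its current capacity and reports whether the element
// storage ended up at a different address.
bool storage_moves_on_growth(std::vector<int>& v)
{
    assert(!v.empty());
    // Save the address as an integer so the old pointer is never read after it is freed.
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    v.resize(v.capacity() + 1);  // exceeds capacity, so new storage must be allocated
    return reinterpret_cast<std::uintptr_t>(v.data()) != before;
}

// Shrinking never reallocates, so the storage address is unchanged.
bool storage_stays_on_shrink(std::vector<int>& v)
{
    assert(v.size() >= 2);
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    v.resize(v.size() / 2);
    return reinterpret_cast<std::uintptr_t>(v.data()) == before;
}
```

An address captured with MPI_Get_address before a growing resize() would therefore point at freed memory afterwards. Even when the storage does not move, as in the shrinking case, the block counts recorded in the datatype no longer match the vector, so rebuilding the datatype is still the safe choice.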

Please excuse any sloppy C++ code above.
