Why does MPI_Barrier cause a segmentation fault in C++


Problem description


I have reduced my program to the following example:

#include <mpi.h>

int main(int argc, char * argv[]) {
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
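One way to narrow down where the crash happens (a hedged sketch, not part of the original question) is to print from each rank before and after the barrier, and to switch MPI's error handler to MPI_ERRORS_RETURN so that recoverable MPI errors are reported instead of aborting. Note that a hard segfault is a signal and will not be caught by the error handler, but the before/after prints still show which ranks reach the barrier:

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);

    // Ask MPI to return error codes instead of aborting, so that a
    // recoverable failure inside MPI_Barrier can be reported.
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    int rank = -1, size = -1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    std::fprintf(stderr, "rank %d of %d: before barrier\n", rank, size);

    int rc = MPI_Barrier(MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len = 0;
        MPI_Error_string(rc, msg, &len);
        std::fprintf(stderr, "rank %d: MPI_Barrier failed: %.*s\n",
                     rank, len, msg);
    }
    std::fprintf(stderr, "rank %d: after barrier\n", rank);

    MPI_Finalize();
    return 0;
}
```

If "before barrier" appears for both ranks but "after barrier" never does, the crash is inside the barrier's communication path rather than in the program itself.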

I compile and run the code, and get the following result:

My-MacBook-Pro-2:xCode_TrapSim user$ mpicxx -g -O0 -Wall barrierTest.cpp -o barrierTestExec
My-MacBook-Pro-2:xCode_TrapSim user$ mpiexec -n 2 ./barrierTestExec

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 21633 RUNNING AT My-MacBook-Pro-2.local
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault: 11 (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

If I comment out the MPI_Barrier, or run the program on only one node, the code runs fine. I am using the following compilers:

My-MacBook-Pro-2:xCode_TrapSim user$ mpiexec --version
HYDRA build details:
Version:                                 3.2
Release Date:                            Wed Nov 11 22:06:48 CST 2015
CC:                              clang    
CXX:                             clang++    
F77:                             /usr/local/bin/gfortran   
F90:                             /usr/local/bin/gfortran   
Configure options:                       '--disable-option-checking' '--prefix=/usr/local/Cellar/mpich/3.2_1' '--disable-dependency-tracking' '--disable-silent-rules' '--mandir=/usr/local/Cellar/mpich/3.2_1/share/man' 'CC=clang' 'CXX=clang++' 'FC=/usr/local/bin/gfortran' 'F77=/usr/local/bin/gfortran' '--cache-file=/dev/null' '--srcdir=.' 'CFLAGS= -O2' 'LDFLAGS=' 'LIBS=-lpthread ' 'CPPFLAGS= -I/private/tmp/mpich-20160606-48824-1qsaqn8/mpich-3.2/src/mpl/include -I/private/tmp/mpich-20160606-48824-1qsaqn8/mpich-3.2/src/mpl/include -I/private/tmp/mpich-20160606-48824-1qsaqn8/mpich-3.2/src/openpa/src -I/private/tmp/mpich-20160606-48824-1qsaqn8/mpich-3.2/src/openpa/src -D_REENTRANT -I/private/tmp/mpich-20160606-48824-1qsaqn8/mpich-3.2/src/mpi/romio/include'
Process Manager:                         pmi
Launchers available:                     ssh rsh fork slurm ll lsf sge manual persist
Topology libraries available:            hwloc
Resource management kernels available:   user slurm ll lsf sge pbs cobalt
Checkpointing libraries available:       
Demux engines available:                 poll select


My-MacBook-Pro-2:xCode_TrapSim user$ clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.5.0
Thread model: posix
InstalledDir:     /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

This seems like it should be a trivial problem, but I can't seem to figure it out. Why would MPI_Barrier cause this simple code to seg fault?

Solution

It is super hard to tell what is wrong with your installation. However, if you can use any of the MPI flavors, maybe you can try this one:

http://www.owsiak.org/?p=3492

All I can say is that it works with Open MPI:

~/opt/usr/local/bin/mpicxx -g -O0 -Wall barrierTestExec.cpp -o barrierTestExec
~/opt/usr/local/bin/mpiexec -n 2 ./barrierTestExec

and there is no exception in my case. It really does seem to be environment specific.
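A crash like this often comes from mixing pieces of two MPI installations (for example, compiling against one MPI's headers but launching with another MPI's mpiexec). As an assumption about this particular setup, the following commands can help verify that the wrapper, launcher, and linked library all belong to the same installation:

```shell
# The compiler wrapper and the launcher must come from the same MPI build.
which mpicxx mpiexec

# MPICH wrappers print the underlying compile/link line with -show
# (Open MPI uses --showme); inspect the -L/-l entries to see which
# MPI library will actually be linked.
mpicxx -show

# Confirm the binary resolves to that same library at run time.
otool -L ./barrierTestExec    # macOS; use ldd on Linux
```

If `mpicxx -show` points at one installation's libmpi but `which mpiexec` resolves to another, launching with the matching mpiexec (or reinstalling a single MPI) is the usual fix.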
