Writing distributed arrays using MPI-IO and Cartesian topology
Question
I have an MPI code that implements 2D domain decomposition to compute numerical solutions to a PDE. Currently I write certain 2D distributed arrays out for each process (e.g. array_x--> proc000x.bin). I want to reduce that to a single binary file.
array_0, array_1,
array_2, array_3,
Suppose the above illustrates a Cartesian topology with 4 processes (2x2). Each 2D array has dimension (nx + 2, nz + 2). The +2 signifies "ghost" layers added to all sides for communication purposes.
I would like to extract the main arrays (omit the ghost layers) and write them to a single binary file with an order something like,
array_0, array_1, array_2, array_3 --> output.bin
If possible it would be preferable to write it as though I had access to the global grid and was writing row-by-row i.e.,
row 0 of array_0, row 0 of array_1, row 1 of array_0, row 1 of array_1, ...
The code below, in file array_test.c, attempts the former of the two output formats.
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>

/* 2D array allocation */
float **alloc2D(int rows, int cols);

float **alloc2D(int rows, int cols) {
    int i, j;
    float *data = malloc(rows * cols * sizeof(float));
    float **arr2D = malloc(rows * sizeof(float *));
    for (i = 0; i < rows; i++) {
        arr2D[i] = &(data[i * cols]);
    }
    /* Initialize to zero */
    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            arr2D[i][j] = 0.0;
        }
    }
    return arr2D;
}

int main(void) {
    /* Creates 5x5 array of floats with padding layers and
     * attempts to write distributed arrays */
    /* Run toy example with 4 processes */
    int i, j, row, col;
    int nx = 5, ny = 5, npad = 1;
    int my_rank, nproc = 4;
    int dim[2] = {2, 2};    /* 2x2 cartesian grid */
    int period[2] = {0, 0};
    int coord[2];
    int reorder = 1;
    float **A = NULL;
    MPI_Comm grid_Comm;

    /* Initialize MPI */
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Establish cartesian topology */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dim, period, reorder, &grid_Comm);

    /* Get cartesian grid indices of processes */
    MPI_Cart_coords(grid_Comm, my_rank, 2, coord);
    row = coord[1];
    col = coord[0];

    /* Add ghost layers */
    nx += 2 * npad;
    ny += 2 * npad;
    A = alloc2D(nx, ny);

    /* Create derived datatype for interior grid (output grid) */
    MPI_Datatype grid;
    int start[2] = {npad, npad};
    int arrsize[2] = {nx, ny};
    int gridsize[2] = {nx - 2 * npad, ny - 2 * npad};
    MPI_Type_create_subarray(2, arrsize, gridsize,
                             start, MPI_ORDER_C, MPI_FLOAT, &grid);
    MPI_Type_commit(&grid);

    /* Fill interior grid */
    for (i = npad; i < nx - npad; i++) {
        for (j = npad; j < ny - npad; j++) {
            A[i][j] = my_rank + i;
        }
    }

    /* MPI IO */
    MPI_File fh;
    MPI_Status status;
    char file_name[100];
    int N, offset;
    sprintf(file_name, "output.bin");
    MPI_File_open(grid_Comm, file_name, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);
    N = (nx - 2 * npad) * (ny - 2 * npad);
    offset = (row * 2 + col) * N * sizeof(float);
    MPI_File_set_view(fh, offset, MPI_FLOAT, grid, "native",
                      MPI_INFO_NULL);
    MPI_File_write_all(fh, &A[0][0], N, MPI_FLOAT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    /* Cleanup */
    free(A[0]);
    free(A);
    MPI_Type_free(&grid);
    MPI_Finalize();
    return 0;
}
Compile with
mpicc -o array_test array_test.c
Run with
mpiexec -n 4 array_test
While the code compiles and runs, the output is incorrect. I'm assuming that I have misinterpreted the use of the derived datatype and file writing in this case. I'd appreciate some help figuring out my mistakes.
Answer
The error you make here is that you have the wrong file view. Instead of creating a type representing the share of the file the current processor is responsible for, you use the mask corresponding to the local data you want to write.
You actually have two very distinct masks to consider:
- the mask for the local data, excluding the halo layers; and
- the mask for the global data, as it should be once collated into the file.
The former corresponds to this layout:
Here, the data that you want to output on the file for a given process is in dark blue, and the halo layer that should not be written on the file is in lighter blue.
The latter corresponds to this layout:
Here, each colour corresponds to the local data coming from a different process, as distributed on the 2D Cartesian grid.
To understand what you need to create to reach this final result, you have to think backwards:
- Your final call to the IO routine should be MPI_File_write_all(fh, &A[0][0], 1, interior, MPI_STATUS_IGNORE);. So you have to have your interior type defined so as to exclude the halo boundary. Fortunately, the grid type you created already does exactly that, so we will use it.
- But now, you have to set a view on the file to allow for this MPI_File_write_all() call. The view must be as described in the second picture, so we will create a new MPI type representing it. For that, MPI_Type_create_subarray() is what we need.
Here is the synopsis of this function:
int MPI_Type_create_subarray(int ndims,
                             const int array_of_sizes[],
                             const int array_of_subsizes[],
                             const int array_of_starts[],
                             int order,
                             MPI_Datatype oldtype,
                             MPI_Datatype *newtype)

Create a datatype for a subarray of a regular, multidimensional array

INPUT PARAMETERS
    ndims             - number of array dimensions (positive integer)
    array_of_sizes    - number of elements of type oldtype in each dimension
                        of the full array (array of positive integers)
    array_of_subsizes - number of elements of type oldtype in each dimension
                        of the subarray (array of positive integers)
    array_of_starts   - starting coordinates of the subarray in each dimension
                        (array of nonnegative integers)
    order             - array storage order flag (state)
    oldtype           - array element datatype (handle)

OUTPUT PARAMETERS
    newtype           - new datatype (handle)
For our 2D Cartesian file view, here are what we need for these input parameters:
- ndims: 2, as the grid is 2D
- array_of_sizes: the dimensions of the global array to output, namely { nnx*dim[0], nny*dim[1] }
- array_of_subsizes: the dimensions of the local share of the data to output, namely { nnx, nny }
- array_of_starts: the x,y start coordinates of the local share within the global grid, namely { nnx*coord[0], nny*coord[1] }
- order: the ordering is C, so this must be MPI_ORDER_C
- oldtype: the data are floats, so this must be MPI_FLOAT
Now that we have our type for the file view, we simply apply it with MPI_File_set_view(fh, 0, MPI_FLOAT, view, "native", MPI_INFO_NULL);
and the magic is done.
Your full code becomes:
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>

/* 2D array allocation */
float **alloc2D(int rows, int cols);

float **alloc2D(int rows, int cols) {
    int i, j;
    float *data = malloc(rows * cols * sizeof(float));
    float **arr2D = malloc(rows * sizeof(float *));
    for (i = 0; i < rows; i++) {
        arr2D[i] = &(data[i * cols]);
    }
    /* Initialize to zero */
    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            arr2D[i][j] = 0.0;
        }
    }
    return arr2D;
}

int main(void) {
    /* Creates 5x5 array of floats with padding layers and
     * attempts to write distributed arrays */
    /* Run toy example with 4 processes */
    int i, j, row, col;
    int nx = 5, ny = 5, npad = 1;
    int my_rank, nproc = 4;
    int dim[2] = {2, 2};    /* 2x2 cartesian grid */
    int period[2] = {0, 0};
    int coord[2];
    int reorder = 1;
    float **A = NULL;
    MPI_Comm grid_Comm;

    /* Initialize MPI */
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Establish cartesian topology */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dim, period, reorder, &grid_Comm);

    /* Get cartesian grid indices of processes */
    MPI_Cart_coords(grid_Comm, my_rank, 2, coord);
    row = coord[1];
    col = coord[0];

    /* Add ghost layers */
    nx += 2 * npad;
    ny += 2 * npad;
    A = alloc2D(nx, ny);

    /* Create derived datatype for interior grid (output grid) */
    MPI_Datatype grid;
    int start[2] = {npad, npad};
    int arrsize[2] = {nx, ny};
    int gridsize[2] = {nx - 2 * npad, ny - 2 * npad};
    MPI_Type_create_subarray(2, arrsize, gridsize,
                             start, MPI_ORDER_C, MPI_FLOAT, &grid);
    MPI_Type_commit(&grid);

    /* Fill interior grid */
    for (i = npad; i < nx - npad; i++) {
        for (j = npad; j < ny - npad; j++) {
            A[i][j] = my_rank + i;
        }
    }

    /* Create derived type for file view */
    MPI_Datatype view;
    int nnx = nx - 2 * npad, nny = ny - 2 * npad;
    int startV[2] = { coord[0] * nnx, coord[1] * nny };
    int arrsizeV[2] = { dim[0] * nnx, dim[1] * nny };
    int gridsizeV[2] = { nnx, nny };
    MPI_Type_create_subarray(2, arrsizeV, gridsizeV,
                             startV, MPI_ORDER_C, MPI_FLOAT, &view);
    MPI_Type_commit(&view);

    /* MPI IO */
    MPI_File fh;
    MPI_File_open(grid_Comm, "output.bin", MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_FLOAT, view, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, &A[0][0], 1, grid, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    /* Cleanup */
    free(A[0]);
    free(A);
    MPI_Type_free(&view);
    MPI_Type_free(&grid);
    MPI_Finalize();
    return 0;
}