Writing a matrix into a single txt file with MPI


Problem description

I have a huge matrix that I divided into sub-matrices, and I do some computation on each of them. After those computations I have to write the matrix into a single file for post-processing. Is it possible to write the results into a single text file, and how can I do it? For example, we have an nx * ny matrix that is divided in the y direction (each process has an nx * rank matrix), and we want to write the nx * ny matrix into a single text file.

Recommended answer

It's not a good idea to write large amounts of data as text. It's really, really slow, it generates unnecessarily large files, and it's a pain to deal with. Large amounts of data should be written as binary, with only summary data for humans written as text. Make the stuff the computer is going to deal with easy for the computer, and make only the stuff you're actually going to sit down and read easy for you to deal with (e.g., text).
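
To put a rough number on the size difference, the sketch below compares the bytes needed for raw binary floats with the bytes needed for fixed-width text. The function names are made up for illustration, and the text estimate assumes every number occupies the same number of characters, separator included.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to store an nrows x ncols matrix of floats as raw binary. */
size_t binary_bytes(size_t nrows, size_t ncols) {
    return nrows * ncols * sizeof(float);
}

/* Bytes needed for the same matrix as fixed-width text, assuming each
   number takes charspernum characters including its separator/newline. */
size_t text_bytes(size_t nrows, size_t ncols, size_t charspernum) {
    assert(charspernum > 0);
    return nrows * ncols * charspernum;
}
```

With 9 characters per number, a 10 x 10 matrix needs 900 bytes as text. On a platform where `float` is 4 bytes, the binary form needs 400 bytes and keeps every bit of the value, where a format such as `%8.3f` keeps only three decimals.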

Whether you're going to write text or binary, you can use MPI-IO to coordinate the output of all processes and generate one large file. We have a little tutorial on the topic (using MPI-IO, HDF5, and NetCDF) here. For MPI-IO, the trick is to define a type (here, a subarray) that describes the local layout of the data in terms of the global layout of the file, and then write to the file using that type as the "view". Each process sees only its own view, and the MPI-IO library coordinates the output so that, as long as the views are non-overlapping, everything comes out as one big file.
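
The non-overlap condition can be checked with plain arithmetic, without any MPI calls. The sketch below assumes a row-wise split in which the last rank absorbs the leftover rows, and a file made of fixed-width fields; the helper names are hypothetical, not part of MPI.

```c
#include <assert.h>
#include <stddef.h>

/* First global row owned by `rank` when nrows rows are split over
   `size` ranks in equal chunks. */
int start_row(int rank, int size, int nrows) {
    assert(size > 0 && rank >= 0 && rank < size);
    return rank * (nrows / size);
}

/* Number of rows owned by `rank`; the last rank takes the remainder. */
int num_rows(int rank, int size, int nrows) {
    assert(size > 0 && rank >= 0 && rank < size);
    if (rank == size - 1)
        return nrows - start_row(rank, size, nrows);
    return nrows / size;
}

/* Byte offset in the shared file where this rank's rows begin, assuming
   every number occupies exactly charspernum bytes. */
size_t view_begin(int rank, int size, int nrows, int ncols, int charspernum) {
    return (size_t)start_row(rank, size, nrows) * (size_t)ncols
         * (size_t)charspernum;
}

/* One past the last byte this rank writes. */
size_t view_end(int rank, int size, int nrows, int ncols, int charspernum) {
    return view_begin(rank, size, nrows, ncols, charspernum)
         + (size_t)num_rows(rank, size, nrows) * (size_t)ncols
         * (size_t)charspernum;
}
```

For 10 rows, 10 columns, 9 characters per number and 4 ranks, rank 0 covers bytes 0 to 179 and rank 3 covers bytes 540 to 899. Each rank's end equals the next rank's beginning, so the views tile the file with no gaps or overlaps.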

If we were writing this out in binary, we'd just point MPI_File_write_all at our data and be done with it; since we're using text, we have to convert our data into a string first. We define our array the way we normally would, except that instead of being an array of MPI_FLOATs, it is an array of a new type that is charspernum characters per number.
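
The conversion only works if every number really does occupy exactly charspernum characters, because the file layout is computed from that width. As a minimal sketch of one way to enforce this, the hypothetical helper `format_row` below formats a single row with `snprintf`, rejects any value that does not fit its field, and copies the field without a terminating `'\0'`, so it never writes past the end of the output buffer. The three-decimal precision is an assumption for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one row of ncols floats into `out` as fixed-width text.
   Each number takes exactly charspernum bytes: a right-aligned value
   followed by a space, or by a newline for the last column.
   `out` must hold ncols * charspernum bytes; no '\0' is appended.
   Returns 0 on success, -1 if a value does not fit its field. */
int format_row(const float *row, int ncols, int charspernum, char *out) {
    char field[64];
    assert(charspernum > 1 && charspernum < (int)sizeof field);
    for (int j = 0; j < ncols; j++) {
        char sep = (j == ncols - 1) ? '\n' : ' ';
        int n = snprintf(field, sizeof field, "%*.3f%c",
                         charspernum - 1, row[j], sep);
        if (n != charspernum)
            return -1;   /* too wide for the field, or an output error */
        memcpy(out + (size_t)j * (size_t)charspernum, field,
               (size_t)charspernum);
    }
    return 0;
}
```

Calling `format_row` once per local row, with `out` advanced by `ncols * charspernum` bytes each time, fills a text buffer of exactly `locnrows * ncols * charspernum` bytes. A value such as 123456789.0 needs more than 8 characters, so the function reports an error for it when charspernum is 9.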

The code follows:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <mpi.h>

float **alloc2d(int n, int m) {
    float *data = malloc(n*m*sizeof(float));
    float **array = malloc(n*sizeof(float *));
    for (int i=0; i<n; i++)
        array[i] = &(data[i*m]);
    return array;
}

int main(int argc, char **argv) {
    int ierr, rank, size;
    MPI_Offset offset;
    MPI_File   file;
    MPI_Status status;
    MPI_Datatype num_as_string;
    MPI_Datatype localarray;
    const int nrows=10;
    const int ncols=10;
    float **data;
    char *const fmt="%8.3f ";
    char *const endfmt="%8.3f\n";
    int startrow, endrow, locnrows;

    const int charspernum=9;

    ierr = MPI_Init(&argc, &argv);
    ierr|= MPI_Comm_size(MPI_COMM_WORLD, &size);
    ierr|= MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    locnrows = nrows/size;
    startrow = rank * locnrows;
    endrow = startrow + locnrows - 1;
    if (rank == size-1) {
        endrow = nrows - 1;
        locnrows = endrow - startrow + 1;
    }

    /* allocate local data */
    data = alloc2d(locnrows, ncols);

    /* fill local data */
    for (int i=0; i<locnrows; i++) 
        for (int j=0; j<ncols; j++)
            data[i][j] = rank;

    /* each number is represented by charspernum chars */
    MPI_Type_contiguous(charspernum, MPI_CHAR, &num_as_string); 
    MPI_Type_commit(&num_as_string); 

    /* convert our data into txt */
    /* +1 because the final sprintf also writes a terminating '\0' */
    char *data_as_txt = malloc((locnrows*ncols*charspernum + 1)*sizeof(char));
    int count = 0;
    for (int i=0; i<locnrows; i++) {
        for (int j=0; j<ncols-1; j++) {
            sprintf(&data_as_txt[count*charspernum], fmt, data[i][j]);
            count++;
        }
        sprintf(&data_as_txt[count*charspernum], endfmt, data[i][ncols-1]);
        count++;
    }

    printf("%d: %s\n", rank, data_as_txt);

    /* create a type describing our piece of the array */
    int globalsizes[2] = {nrows, ncols};
    int localsizes [2] = {locnrows, ncols};
    int starts[2]      = {startrow, 0};
    int order          = MPI_ORDER_C;

    MPI_Type_create_subarray(2, globalsizes, localsizes, starts, order, num_as_string, &localarray);
    MPI_Type_commit(&localarray);

    /* open the file, and set the view */
    MPI_File_open(MPI_COMM_WORLD, "all-data.txt", 
                  MPI_MODE_CREATE|MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &file);

    MPI_File_set_view(file, 0,  MPI_CHAR, localarray, 
                           "native", MPI_INFO_NULL);

    MPI_File_write_all(file, data_as_txt, locnrows*ncols, num_as_string, &status);
    MPI_File_close(&file);

    MPI_Type_free(&localarray);
    MPI_Type_free(&num_as_string);

    free(data_as_txt);
    free(data[0]);
    free(data);

    MPI_Finalize();
    return 0;
}

Running it gives:

$ mpicc -o matrixastxt matrixastxt.c  -std=c99
$ mpirun -np 4 ./matrixastxt
$ more all-data.txt 
   0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000
   0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000
   1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
   1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
   2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000
   2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
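
Because every number in this file has the same width, a post-processing tool can jump straight to any entry without parsing the rows before it. The sketch below shows one way to do that in plain C. The function name `read_entry` is hypothetical, and the code assumes the file uses single-byte `'\n'` line endings and is opened in binary mode so that byte offsets are exact.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read entry (i, j) from a fixed-width text matrix in which every number
   occupies exactly charspernum bytes (separator or newline included).
   Returns 0 on success, -1 on a seek, read, or parse failure. */
int read_entry(FILE *fp, int ncols, int charspernum, int i, int j,
               float *value) {
    char field[64];
    char *end;
    assert(charspernum > 0 && charspernum < (int)sizeof field);
    long offset = ((long)i * ncols + j) * charspernum;
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    if (fread(field, 1, (size_t)charspernum, fp) != (size_t)charspernum)
        return -1;
    field[charspernum] = '\0';
    *value = strtof(field, &end);
    return (end == field) ? -1 : 0;   /* -1 if no number was parsed */
}
```

For the file above, `read_entry(fp, 10, 9, 6, 0, &v)` seeks to byte 540, which is the start of the first row of 3.000 values.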
