使用KSP指南的PETSc求解线性系统 [英] PETSc solving linear system with ksp guide
问题描述
我开始使用PETSc库并行求解线性方程组.我已经安装了所有软件包,并在petsc/src/ksp/ksp/examples/tutorials/文件夹中成功构建并运行了示例,例如ex.c
I am starting use PETSc library to solve linear system of equations in parallel. I have installed all packages, build and run successfully the examples in petsc/src/ksp/ksp/examples/tutorials/ folder, for example ex.c
但是我无法理解如何通过从文件中读取矩阵来填充矩阵A,X和B.
But I couldn't understand how to fill matrices A,X an B by reading them for example from file.
在这里,我在ex2.c文件中提供了代码:
Here I provide the code within ex2.c file:
/* Program usage: mpiexec -n <procs> ex2 [-help] [all PETSc options] */
static char help[] = "Solves a linear system in parallel with KSP.\n\
Input parameters include:\n\
-random_exact_sol : use a random exact solution vector\n\
-view_exact_sol : write exact solution vector to stdout\n\
-m <mesh_x> : number of mesh points in x-direction\n\
-n <mesh_n> : number of mesh points in y-direction\n\n";
/*T
Concepts: KSP^basic parallel example;
Concepts: KSP^Laplacian, 2d
Concepts: Laplacian, 2d
Processors: n
T*/
/*
Include "petscksp.h" so that we can use KSP solvers. Note that this file
automatically includes:
petscsys.h - base PETSc routines petscvec.h - vectors
petscmat.h - matrices
petscis.h - index sets petscksp.h - Krylov subspace methods
petscviewer.h - viewers petscpc.h - preconditioners
*/
#include <C:\PETSC\include\petscksp.h>
#undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **args)
{
Vec x,b,u; /* approx solution, RHS, exact solution */
Mat A; /* linear system matrix */
KSP ksp; /* linear solver context */
PetscRandom rctx; /* random number generator context */
PetscReal norm; /* norm of solution error */
PetscInt i,j,Ii,J,Istart,Iend,m = 8,n = 7,its;
PetscErrorCode ierr;
PetscBool flg = PETSC_FALSE;
PetscScalar v;
#if defined(PETSC_USE_LOG)
PetscLogStage stage;
#endif
PetscInitialize(&argc,&args,(char *)0,help);
ierr = PetscOptionsGetInt(PETSC_NULL,"-m",&m,PETSC_NULL);CHKERRQ(ierr);
ierr = PetscOptionsGetInt(PETSC_NULL,"-n",&n,PETSC_NULL);CHKERRQ(ierr);
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Compute the matrix and right-hand-side vector that define
the linear system, Ax = b.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/*
Create parallel matrix, specifying only its global dimensions.
When using MatCreate(), the matrix format can be specified at
runtime. Also, the parallel partitioning of the matrix is
determined by PETSc at runtime.
Performance tuning note: For problems of substantial size,
preallocation of matrix memory is crucial for attaining good
performance. See the matrix chapter of the users manual for details.
*/
ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,m*n,m*n);CHKERRQ(ierr);
ierr = MatSetFromOptions(A);CHKERRQ(ierr);
ierr = MatMPIAIJSetPreallocation(A,5,PETSC_NULL,5,PETSC_NULL);CHKERRQ(ierr);
ierr = MatSeqAIJSetPreallocation(A,5,PETSC_NULL);CHKERRQ(ierr);
/*
Currently, all PETSc parallel matrix formats are partitioned by
contiguous chunks of rows across the processors. Determine which
rows of the matrix are locally owned.
*/
ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
/*
Set matrix elements for the 2-D, five-point stencil in parallel.
- Each processor needs to insert only elements that it owns
locally (but any non-local elements will be sent to the
appropriate processor during matrix assembly).
- Always specify global rows and columns of matrix entries.
Note: this uses the less common natural ordering that orders first
all the unknowns for x = h then for x = 2h etc; Hence you see J = Ii +- n
instead of J = I +- m as you might expect. The more standard ordering
would first do all variables for y = h, then y = 2h etc.
*/
ierr = PetscLogStageRegister("Assembly", &stage);CHKERRQ(ierr);
ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
for (Ii=Istart; Ii<Iend; Ii++) {
v = -1.0; i = Ii/n; j = Ii - i*n;
if (i>0) {J = Ii - n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
if (i<m-1) {J = Ii + n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
if (j>0) {J = Ii - 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
if (j<n-1) {J = Ii + 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
v = 4.0; ierr = MatSetValues(A,1,&Ii,1,&Ii,&v,INSERT_VALUES);CHKERRQ(ierr);
}
/*
Assemble matrix, using the 2-step process:
MatAssemblyBegin(), MatAssemblyEnd()
Computations can be done while messages are in transition
by placing code between these two statements.
*/
ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = PetscLogStagePop();CHKERRQ(ierr);
/* A is symmetric. Set symmetric flag to enable ICC/Cholesky preconditioner */
ierr = MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE);CHKERRQ(ierr);
/*
Create parallel vectors.
- We form 1 vector from scratch and then duplicate as needed.
- When using VecCreate(), VecSetSizes and VecSetFromOptions()
in this example, we specify only the
vector's global dimension; the parallel partitioning is determined
at runtime.
- When solving a linear system, the vectors and matrices MUST
be partitioned accordingly. PETSc automatically generates
appropriately partitioned matrices and vectors when MatCreate()
and VecCreate() are used with the same communicator.
- The user can alternatively specify the local vector and matrix
dimensions when more sophisticated partitioning is needed
(replacing the PETSC_DECIDE argument in the VecSetSizes() statement
below).
*/
ierr = VecCreate(PETSC_COMM_WORLD,&u);CHKERRQ(ierr);
ierr = VecSetSizes(u,PETSC_DECIDE,m*n);CHKERRQ(ierr);
ierr = VecSetFromOptions(u);CHKERRQ(ierr);
ierr = VecDuplicate(u,&b);CHKERRQ(ierr);
ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
/*
Set exact solution; then compute right-hand-side vector.
By default we use an exact solution of a vector with all
elements of 1.0; Alternatively, using the runtime option
-random_sol forms a solution vector with random components.
*/
ierr = PetscOptionsGetBool(PETSC_NULL,"-random_exact_sol",&flg,PETSC_NULL);CHKERRQ(ierr);
if (flg) {
ierr = PetscRandomCreate(PETSC_COMM_WORLD,&rctx);CHKERRQ(ierr);
ierr = PetscRandomSetFromOptions(rctx);CHKERRQ(ierr);
ierr = VecSetRandom(u,rctx);CHKERRQ(ierr);
ierr = PetscRandomDestroy(&rctx);CHKERRQ(ierr);
} else {
ierr = VecSet(u,1.0);CHKERRQ(ierr);
}
ierr = MatMult(A,u,b);CHKERRQ(ierr);
/*
View the exact solution vector if desired
*/
flg = PETSC_FALSE;
ierr = PetscOptionsGetBool(PETSC_NULL,"-view_exact_sol",&flg,PETSC_NULL);CHKERRQ(ierr);
if (flg) {ierr = VecView(u,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);}
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Create the linear solver and set various options
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/*
Create linear solver context
*/
ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
/*
Set operators. Here the matrix that defines the linear system
also serves as the preconditioning matrix.
*/
ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
/*
Set linear solver defaults for this problem (optional).
- By extracting the KSP and PC contexts from the KSP context,
we can then directly call any KSP and PC routines to set
various options.
- The following two statements are optional; all of these
parameters could alternatively be specified at runtime via
KSPSetFromOptions(). All of these defaults can be
overridden at runtime, as indicated below.
*/
ierr = KSPSetTolerances(ksp,1.e-2/((m+1)*(n+1)),1.e-50,PETSC_DEFAULT,
PETSC_DEFAULT);CHKERRQ(ierr);
/*
Set runtime options, e.g.,
-ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
These options will override those specified above as long as
KSPSetFromOptions() is called _after_ any other customization
routines.
*/
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Solve the linear system
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Check solution and clean up
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/*
Check the error
*/
ierr = VecAXPY(x,-1.0,u);CHKERRQ(ierr);
ierr = VecNorm(x,NORM_2,&norm);CHKERRQ(ierr);
ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
/* Scale the norm */
/* norm *= sqrt(1.0/((m+1)*(n+1))); */
/*
Print convergence information. PetscPrintf() produces a single
print statement from all processes that share a communicator.
An alternative is PetscFPrintf(), which prints to a file.
*/
ierr = PetscPrintf(PETSC_COMM_WORLD,"Norm of error %A iterations %D\n",
norm,its);CHKERRQ(ierr);
/*
Free work space. All PETSc objects should be destroyed when they
are no longer needed.
*/
ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
ierr = VecDestroy(&u);CHKERRQ(ierr); ierr = VecDestroy(&x);CHKERRQ(ierr);
ierr = VecDestroy(&b);CHKERRQ(ierr); ierr = MatDestroy(&A);CHKERRQ(ierr);
/*
Always call PetscFinalize() before exiting a program. This routine
- finalizes the PETSc libraries as well as MPI
- provides summary and diagnostic information if certain runtime
options are chosen (e.g., -log_summary).
*/
ierr = PetscFinalize();
return 0;
}
有人知道如何在示例中填充自己的矩阵吗?
Does someone know how to fill own matrices within examples?
推荐答案
是的,当您入门时,这可能会有些令人生畏. 此 ACTS 教程; PetSC网页上的教程通常非常好.
Yeah, this can be a little daunting when you're getting started. There's a good walk-through of the process in this ACTS tutorial from 2006; the tutorials listed on the PetSC web page are generally quite good.
其中的关键部分是:
ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
实际上创建PetSC矩阵对象Mat A
;
Actually create the PetSC matrix object, Mat A
;
ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,m*n,m*n);CHKERRQ(ierr);
设置尺寸;在这里,矩阵是m*n x m*n
,因为它是用于在m x n
2d网格上运行的模具
set the sizes; here, the matrix is m*n x m*n
, as it's a stencil for operating on an m x n
2d grid
ierr = MatSetFromOptions(A);CHKERRQ(ierr);
如果您想控制A的设置方式,它只需要您在运行时提供的所有PetSC命令行选项,并将它们应用于矩阵.否则,您可能只是使用了> c3> 将其创建为AIJ格式的矩阵(默认设置),
This just takes any PetSC command line options that you might have supplied at run time and apply them to the matrix, if you wanted to control how A was set up; otherwise, you could just, have eg, used MatCreateMPIAIJ()
to create it as an AIJ-format matrix (the default), MatCreateMPIDense() if it was going to be a dense matrix.
ierr = MatMPIAIJSetPreallocation(A,5,PETSC_NULL,5,PETSC_NULL);CHKERRQ(ierr);
ierr = MatSeqAIJSetPreallocation(A,5,PETSC_NULL);CHKERRQ(ierr);
现在我们已经获得了一个AIJ矩阵,这些调用只是预先分配了稀疏矩阵,假设每行有5个非零值.这是为了提高性能.请注意,必须同时调用MPI和Seq函数,以确保此方法可同时用于1个处理器和多个处理器.这似乎总是很奇怪,但是你去了.
Now that we've gotten an AIJ matrix, these calls just pre-allocates the sparse matrix, assuming 5 non-zeros per row. This is for performance. Note that both the MPI and Seq functions must be called to make sure this works with both 1 processor and multiple processors; this always seemed weird to be, but there you go.
好吧,既然矩阵已经设置好了,在这里我们开始研究问题的实质.
Ok, now that the matrix is all set up, here's where we start getting into the actual meat of the matter.
首先,我们找出该特定进程拥有哪些行.该分布是按行分布的,对于典型的稀疏矩阵而言,这是一个很好的分布.
First, we find out which rows this particular process owns. The distribution is by rows, which is a good distribution for typical sparse matrices.
ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
因此,在此调用之后,每个处理器都有自己的Istart和Iend版本,并且此处理器作业将更新以Istart结尾的行,该行以 end Iend结尾,如您在for循环中所见:
So after this call, each processor has its own version of Istart and Iend, and its this processors job to update rows starting at Istart end ending just before Iend, as you see in this for loop:
for (Ii=Istart; Ii<Iend; Ii++) {
v = -1.0; i = Ii/n; j = Ii - i*n;
好,因此,如果我们在行Ii
上进行操作,则这对应于网格位置(i,j)
,其中i = Ii/n
和j = Ii % n
.例如,网格位置(i,j)
对应于行Ii = i*n + j
.有道理吗?
Ok, so if we're operating on row Ii
, this corresponds to grid location (i,j)
where i = Ii/n
and j = Ii % n
. Eg, grid location (i,j)
corresponds to row Ii = i*n + j
. Makes sense?
在这里,我将剔除if语句,因为它们很重要,但它们仅处理边界值,它们会使事情变得更复杂.
I'm going to strip out the if statements here because they're important but they're just dealing with the boundary values and they make things more complicated.
在此行中,对角线上为+4,与(i-1,j)
,(i+1,j)
,(i,j-1)
和(i,j+1)
对应的列为-1s.假设我们没有针对这些问题(例如1 < i < m-1
和1 < j < n-1
)走到网格的尽头,那就意味着
In this row, there will be a +4 on the diagonal, and -1s at columns corresponding to (i-1,j)
, (i+1,j)
, (i,j-1)
, and (i,j+1)
. Assuming that we haven't gone off the end of the grid for these (eg, 1 < i < m-1
and 1 < j < n-1
), that means
J = Ii - n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
J = Ii + n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
J = Ii - 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
J = Ii + 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
v = 4.0; ierr = MatSetValues(A,1,&Ii,1,&Ii,&v,INSERT_VALUES);CHKERRQ(ierr);
}
我取出的if语句只是避免设置那些不存在的值,而CHKERRQ
宏只会在ierr != 0
时打印出一个有用的错误,例如,设置值调用失败(因为我们试图设置无效的值).
The if statements I took out just avoid setting those values if they don't exist, and the CHKERRQ
macro just prints out a useful error if ierr != 0
, eg the set values call failed (because we tried to set an invalid value).
现在,我们已经设置了本地值; MatAssembly
调用开始通信,以确保在处理器之间交换任何必要的值.如果您有任何无关的工作要做,可以将其卡在开始"和结束"之间,以尝试使通信和计算重叠:
Now we've set local values; the MatAssembly
calls start communication to ensure any necessary values are exchanged between processors. If you have any unrelated work to do, it can be stuck between the Begin and End to try to overlap communication and computation:
ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
现在您已完成,可以致电您的求解器.
And now you're done and can call your solvers.
所以典型的工作流程是:
So a typical workflow is:
- 创建矩阵(
MatCreate
) - 设置其大小(
MatSetSizes
) - 设置各种矩阵选项(
MatSetFromOptions
是一个不错的选择,而不是对事物进行硬编码) - 对于稀疏矩阵,将预分配设置为每行非零数目的合理猜测;您可以使用单个值(如此处),也可以使用表示每行非零数量的数组(此处用
PETSC_NULL
填充)来执行此操作:(MatMPIAIJSetPreallocation
,MatSeqAIJSetPreallocation
) - 找出您要负责的行:(
MatGetOwnershipRange
) - 设置值(每个值调用一次
MatSetValues
或传入一大堆值;INSERT_VALUES
设置新元素,ADD_VALUES
递增任何现有元素) - 然后进行组装(
MatAssemblyBegin
,MatAssemblyEnd
).
- Create your matrix (
MatCreate
) - Set its size (
MatSetSizes
) - Set various matrix options (
MatSetFromOptions
is a good choice, rather than hardcoding things) - For sparse matrices, set the preallocation to reasonable guesses for the number of non-zeros per row; you can do this with a single value (as here), or with an array representing the number of non-zeros per row (here filled in with
PETSC_NULL
): (MatMPIAIJSetPreallocation
,MatSeqAIJSetPreallocation
) - Find out which rows are your responsibility: (
MatGetOwnershipRange
) - Set the values (calling
MatSetValues
either once per value, or passing in a chunk of values;INSERT_VALUES
sets new elements,ADD_VALUES
increments any existing elements) - Then do the assembly (
MatAssemblyBegin
,MatAssemblyEnd
).
其他更复杂的用例也是可能的.
Other more complicated use cases are possible.
这篇关于使用KSP指南的PETSc求解线性系统的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!