Fortran函数用于在派生类型与可分配组件之间过载乘法 [英] Fortran function to overload multiplication between derived types with allocatable components

查看:188
本文介绍了Fortran函数用于在派生类型与可分配组件之间过载乘法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

为了存储带状矩阵,整个对应的矩阵可以同时具有从<$ c $以外的索引索引的行和列c> 1 ,我定义了一个派生数据类型为

  TYPE CDS 
REAL, DIMENSION(:, :),ALLOCATABLE :: matrix
INTEGER,DIMENSION(2):: lb,ub
INTEGER :: ld,ud
结束类型CDS

其中CDS代表压缩的对角线存储。
给定声明 TYPE(CDS):: A


  1. rank-2组件 matrix 应该包含实际完整矩阵的对角线(如这里,除了我将对角线存储为列而不是行)。

  2. 组件 ld ud 应该分别包含下对角线和上对角线的数量,即 -lbound(A%matrix,2) + ubound(A%matrix,2)

  3. 2元素组件 lb ub 应该包含实际的下限和上限整个矩阵沿着两个维度。特别是 lb(1) ub(1)应该与 lbound( A%matrix,1) lbound(A%matrix,2)

正如您在第2点和第3点中所看到的那样,派生类型包含一些冗余信息,但我不在乎,因为它们只是3对整数。此外,在我写的代码中,关于实际完整矩阵的边界和频带的信息在矩阵可以被填充之前就已知。因此,我首先将值分配给组件 ld ud lb ub ,然后我将这些组件用于 ALLOCATE 矩阵组件(然后我可以正确填写它)。

问题



我必须在这些稀疏矩阵之间进行矩阵乘法,因此我编写了一个 FUNCTION 来执行此类乘积并用它来重载 * 运营商。



目前该函数如下所示:

pre > FUNCTION CDS_mat_x_CDS_mat(A,B)
IMPLICIT NONE
TYPE(CDS),INTENT(IN):: A,B
TYPE(CDS):: cds_mat_x_cds_mat

! (1)= A%ub(1)确定结果的下限和上限以及基于操作数的结果的上限和下限:
CDS_mat_x_CDS_mat%lb(1)= A%lb(1)
CDS_mat_x_CDS_mat% )
CDS_mat_x_CDS_mat%lb(2)= B%lb(2)
CDS_mat_x_CDS_mat%ub(2)= B%ub(2)
CDS_mat_x_CDS_mat%ld = A%ld + B%ld
CDS_mat_x_CDS_mat%ud = A%ud + B%ud

!分配矩阵分量
ALLOCATE(CDS_mat_x_CDS_mat%矩阵(CDS_mat_x_CDS_mat%lb(1):CDS_mat_x_CDS_mat%ub(1),&
& -CDS_mat_x_CDS_mat%ld:+ CDS_mat_x_CDS_mat%ud))

!执行产品



END FUNCTION



<这意味着,如果我必须多次执行该产品,那么在该函数内多次执行分配内部。我认为这从性能的角度来看并不好。



对于如何完成带状稀疏矩阵时间带状稀疏矩阵的任务,我提出了一些建议。我想使用我定义的类型,因为我需要它像现在一样在边界方面一般。但是,如果需要,我可以更改执行产品的过程(从 FUNCTION SUBROUTINE )。

想法

我可以将程序改写为 SUBROUTINE 为了用 INTENT(INOUT)声明 CDS_mat_x_CDS_mat ,分配组件除了 matrix 以外,还有 SUBROUTINE 之外的分配。缺点是我不能重载 * 运算符。



我注意到了内部函数 MATMUL 可以在任何rank-2操作数上运行,无论在两个维度上的上限还是下限。这意味着分配在函数内部执行。我认为它是有效的(因为它是一个内在的)。与我的函数的不同之处在于它接受任意形状的rank-2数组,我的接受派生数据类型对象具有任何形状的rank-2数组组件。


因此,没有Fortran语言ALLOCATABLE变量与内部MATMUL函数结果的等效关联。这与没有内存分配不同 - 编译器可能仍然需要为了自己的目的在场景后面分配内存 - 例如表达式临时对象等。



(我说上面的相当于,因为内在过程本质上是特殊的 - 但假装MATMUL只是一个用户函数。)

通过使用长度类型参数,可以实现与您的情况相同的这种自动结果。这是Fortran 2003的特性 - 引入了可分配组件的相同基本语言标准 - 但它并不是所有主动维护的编译器都已实现的。

  MODULE xyz 
IMPLICIT NONE

TYPE CDS(lb1,ub1,ld,ud)
INTEGER,LEN :: lb1,ub1,ld,ud
REAL ::矩阵(lb1:ub1,ld:ud)
INTEGER :: lb2,ub2
结束类型CDS

接口操作符(*)
模块过程CDS_mat_x_CDS_mat
END INTERFACE OPERATOR(*)
包含
函数CDS_mat_x_CDS_mat(A,B)结果(C)
类型(CDS(*,*,*,*)) ,INTENT(IN):: A,B
TYPE(CDS(A%lb1,A%ub1,A%ld + B%ld,A%ud + B%ud)):: C

C%lb2 = B%lb2
C%ub2 = B%ub2

!执行产品。
! :

END FUNCTION CDS_mat_x_CDS_mat
END MODULE xyz

从理论上讲,为编译器提供了更多优化机会,因为它在函数结果所需的存储函数调用之前具有更多的洞察力。这实际上是否会带来更好的真实世界性能取决于编译器的实现和函数引用的性质。


Foreword

In order to store banded matrices whose full counterparts can have both rows and columns indexed from indices other than 1, I defined a derived data type as

TYPE CDS
  REAL, DIMENSION(:,:), ALLOCATABLE :: matrix
  INTEGER, DIMENSION(2) :: lb, ub
  INTEGER :: ld, ud
END TYPE CDS

where CDS stands for compressed diagonal storage. Given the declaration TYPE(CDS) :: A,

  1. The rank-2 component matrix is supposed to contain, as columns, the diagonals of the actual full matrix (like here, except that I store the diagonals as columns and not as rows).
  2. The components ld and ud are supposed to contain the number of lower and upper diagonals respectively, that is -lbound(A%matrix,2) and +ubound(A%matrix,2).
  3. The 2-elements components lb and ub are supposed to contain the lower bounds and upper bounds of the actual full matrix along the two dimensions. In particular lb(1) and ub(1) should be the same as lbound(A%matrix,1) and lbound(A%matrix,2).

As you can see in points 2. and 3., the derived type contains some redundant information, but I don't care, since they are just 3 couples of integers. Furthermore, in the code that I'm writing, the information about the bounds and the band of the actual full matrix is know before the matrix can be filled. So I first assign the values to the components ld, ud, lb and ub, and then I used these components to ALLOCATE the matrix component (then I can fill it properly).

The problem

I have to perform matrix multiplication between such sparse matrices, so I wrote a FUNCTION to perform such product and used it to overload the * operator.

At the moment the function is as follows,

FUNCTION CDS_mat_x_CDS_mat(A, B)
IMPLICIT NONE
TYPE(CDS), INTENT(IN) :: A, B
TYPE(CDS) :: cds_mat_x_cds_mat

! determine the lower and upper bounds and band of the result based on those of the operands
CDS_mat_x_CDS_mat%lb(1) = A%lb(1)
CDS_mat_x_CDS_mat%ub(1) = A%ub(1)
CDS_mat_x_CDS_mat%lb(2) = B%lb(2)
CDS_mat_x_CDS_mat%ub(2) = B%ub(2)
CDS_mat_x_CDS_mat%ld = A%ld + B%ld
CDS_mat_x_CDS_mat%ud = A%ud + B%ud

! allocate the matrix component
ALLOCATE(CDS_mat_x_CDS_mat%matrix(CDS_mat_x_CDS_mat%lb(1):CDS_mat_x_CDS_mat%ub(1),&
                               & -CDS_mat_x_CDS_mat%ld:+CDS_mat_x_CDS_mat%ud))

! perform the product
:
:

END FUNCTION

This means that, if I have to do the product multiple times the allocation is done multiple times inside the function. I think this is not good from a performance point of view.

I ask for suggestions on how to accomplish the task of banded sparse matrix times banded sparse matrix. I would like to use the type the I defined because I need it to be as general, in terms of bounds, as it is at the moment. But I could change the procedure to perform the product (from FUNCTION to SUBROUTINE, if necessary).

Ideas

I could rewrite the procedure as a SUBROUTINE in order to declare CDS_mat_x_CDS_mat with INTENT(INOUT) do the assignment of components other than matrix, as well as the allocation, outside the SUBROUTINE. The drawback would be that I could not overload the * operator.

I noticed the intrinsic function MATMUL can operate on any rank-2 operands, whichever the upper and lower bounds along the two dimensions. This means that the allocation is performed inside the function. I suppose it's efficient (since it is an intrinsic). The difference with respect to my function is that it accepts rank-2 arrays of any shape, wheres mine accept derived data types objects with a rank-2 array component of any shape.

解决方案

The intrinsic function MATMUL has the equivalent of an automatic (F2008 5.2.2) result - the shape of the result is expressed in such a way that it becomes a characteristic of the function (F2008 12.3.3) - the shape of the function result is determined in the specification part of the function and (in terms of implementation) the compiler therefore knows how to calculate the shape of the function result ahead of actually executing the function proper.

As a consequence there are no Fortran language ALLOCATABLE variables associated with the equivalent of the intrinsic MATMUL function result. This is not the same thing as there being "no memory allocation" - the compiler may still need to allocate memory behind the scenes for its own purposes - for things like expressions temporaries and the like.

(I say "equivalent of" above, because intrinsic procedures are inherently special - but pretend for a moment that MATMUL was just a user function.)

The equivalent of that sort of automatic result for your case can be achieved by using length type parameters. This is Fortran 2003 feature - the same base language standard that introduced allocatable components - but it is not one that has yet been implemented by all actively maintained compilers.

MODULE xyz
  IMPLICIT NONE

  TYPE CDS(lb1, ub1, ld, ud)
    INTEGER, LEN :: lb1, ub1, ld, ud
    REAL :: matrix(lb1:ub1, ld:ud)
    INTEGER :: lb2, ub2
  END TYPE CDS

  INTERFACE OPERATOR(*)
    MODULE PROCEDURE CDS_mat_x_CDS_mat
  END INTERFACE OPERATOR(*)
CONTAINS
  FUNCTION CDS_mat_x_CDS_mat(A, B) RESULT(C)
    TYPE(CDS(*,*,*,*)), INTENT(IN) :: A, B
    TYPE(CDS(A%lb1, A%ub1, A%ld+B%ld, A%ud+B%ud)) :: C

    C%lb2 = B%lb2
    C%ub2 = B%ub2

    ! perform the product.
    ! :

  END FUNCTION CDS_mat_x_CDS_mat
END MODULE xyz

Theoretically this gives the compiler more opportunity for optimisation, because it has more insight ahead of the call to the function for the storage required for the function result. Whether this actually results in better real world performance depends on the implementation of the compiler and the nature of the function references.

这篇关于Fortran函数用于在派生类型与可分配组件之间过载乘法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆