如何做一个fftw3 MPI“转置”如果可能的话,2D变换? [英] How to do a fftw3 MPI "transposed" 2D transform if possible at all?

查看:309
本文介绍了如何做一个fftw3 MPI“转置”如果可能的话,2D变换?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

考虑从复杂数组 src 转换为真实数组 tgt 的形式L x M(列主设置)的2D变换。或者,在Fortranese中,

  complex(C_DOUBLE_COMPLEX),pointer :: src(:, :) 
real(8 ),指针:: tgt(:, :)。

对应的指针是

  type(C_PTR):: csrc,ctgt。 

我会按以下方式分配它们:

 !复数组首先
alloc_local = fftw_mpi_local_size_2d(M,L / 2 + 1,MPI_COMM_WORLD,local_M,local_offset1)
csrc = fftw_alloc_complex(alloc_local)
调用c_f_pointer(csrc,src,[L / 2,local_M])

!现在真实数组
alloc_local = fftw_mpi_local_size_2d(2 *(L / 2 + 1),M,&
MPI_COMM_WORLD,local_L,local_offset2)
ctgt = fftw_alloc_real(alloc_local)
调用c_f_pointer(ctgt,tgt,[M,local_L])

现在, :

 !创建具有一个转置的c→r转换省略
plan = fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt,MPI_COMM_WORLD,&
ior(FFTW_MEASURE,FFTW_MPI_TRANSPOSED_OUT))

最后,转换将按照以下方式执行:

  call fftw_mpi_execute_dft_c2r(plan,src,tgt)

然而,这个处方没有工作。最后的呼叫导致分段错误。起初,我认为这可能与我如何分配 src tgt 数组有关,但使用分配给 tgt 没有给出任何结果。所以,我要么在做一些非常愚蠢的事情,要么根本不可能做到。

$ b

 程序trashingfftw 
use,intrinsic :: iso_c_binding
使用MPI

隐式无
include'fftw3- mpi.f03'

整数(C_INTPTR_T),参数:: L = 256
整数(C_INTPTR_T),参数:: M = 256

类型(C_PTR) :: plan,ctgt,csrc

complex(C_DOUBLE_COMPLEX),pointer :: src(:, :)
real(8),pointer :: tgt(:, :)

整数(C_INTPTR_T):: alloc_local,local_M,&
& local_L,local_offset1,local_offset2

integer :: ierr,id


call mpi_init(ierr)

call mpi_comm_rank(MPI_COMM_WORLD,id ,ierr)

call fftw_mpi_init()


alloc_local = fftw_mpi_local_size_2d(M,L / 2 + 1,MPI_COMM_WORLD,&
local_M,local_offset1 )

csrc = fftw_alloc_complex(alloc_local)
调用c_f_pointer(csrc,src,[L / 2,local_M])


alloc_local = fftw_mpi_local_size_2d( 2 *(L / 2 + 1),M,MPI_COMM_WORLD,&
& local_L,local_offset2)

ctgt = fftw_alloc_real(alloc_local)
call c_f_pointer(ctgt,tgt (M,L,src,tgt,MPI_COMM_WORLD,&
ior(FFTW_MEASURE,FFTW_MPI_TRANSPOSED_OUT))

调用fftw_mpi_execute_dft_c2r(plan,src,tgt)

调用mpi_finalize(ierr)


结束程序trashingfftw


答案是:

对于mpi实数变换,只有两种允许的换位和方向组合:


$ b

  • 实数到复数变换和FFTW_MPI_TRANSPOSED_OUT
  • 复数到实数变换和FFTW_MPI_TRANSPOSED_IN



我在挖掘fftw3版本的同时发现了这一点。 3.3.4代码,文件rdft2-problem.c,注释行120.

编辑:

<工作示例:

 程序trashingfftw 
use,intrinsic :: iso_c_binding
使用MPI

隐式无
包含'fftw3-mpi.f03'

整数(C_INTPTR_T),参数:: L = 256
integer(C_INTPTR_T),parameter :: M = 256

type(C_PTR):: plan,ctgt,csrc

complex(C_DOUBLE_COMPLEX),pointer :: src( :,:)
real(8),pointer :: tgt(:, :)

整数(C_INTPTR_T):: alloc_local,local_M,&
& local_L,local_offset1,local_offset2

integer :: ierr,id


call mpi_init(ierr)

call mpi_comm_rank(MPI_COMM_WORLD,id ,ierr)

调用fftw_mpi_init()


alloc_local = fftw_mpi_local_size_2d(L / 2 + 1,M,MPI_COMM_WORLD,&
local_l,local_offset1 )

print *,id,alloc complex =,alloc_local,local_l

csrc = fftw_alloc_complex(alloc_local)
call c_f_pointer(csrc,src,[M ,local_l])

!警告:必须根据复杂的布局划分真实的存储空间,这就是为什么
!我使用M和L / 2 + 1而不是M,2 *(L / 2 + 1),因为它在原始文章
alloc_local = fftw_mpi_local_size_2d(M,L / 2 + 1,MPI_COMM_WORLD,& ;
& local_M,local_offset2)

print *,id,alloc real =,alloc_local,local_m
!每个复数的两个实数
ctgt = fftw_alloc_real(2 * alloc_local)
!只有第一个L是相关的,其余的只是悬空(见fftw3 docs)
!caveat:因为填充是在第一个索引中,所以第二个数据是非连续布置的
!(L )
call c_f_pointer(ctgt,tgt,[2 *(L / 2 + 1),local_m])


plan = fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt,MPI_COMM_WORLD,&
ior(FFTW_MEASURE,FFTW_MPI_TRANSPOSED_IN))

!应该是非空
print *,'plan:',计划

src(3,2)=(1.,0)
调用fftw_mpi_execute_dft_c2r(plan,src, tgt)

调用mpi_finalize(ierr)
结束程序thrashingfftw


Consider a 2D transform of the form L x M (column major setup), from a complex array src to a real array tgt. Or , in Fortranese,

complex(C_DOUBLE_COMPLEX), pointer :: src(:,:)
real(8), pointer :: tgt(:,:)  .

Corresponding pointers are

type(C_PTR) :: csrc,ctgt   .

I would allocate them in the following manner:

  ! The complex array first
    alloc_local = fftw_mpi_local_size_2d(M,L/2+1,MPI_COMM_WORLD,local_M,local_offset1)
    csrc = fftw_alloc_complex(alloc_local)
    call c_f_pointer(csrc, src, [L/2,local_M])

    ! Now the real array
    alloc_local = fftw_mpi_local_size_2d(2*(L/2+1),M, &
                                   MPI_COMM_WORLD,local_L,local_offset2)
    ctgt = fftw_alloc_real(alloc_local)
    call c_f_pointer(ctgt, tgt, [M,local_L])

Now, the plan would be created as:

! Create c-->r transform with one transposition left out
plan =  fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & 
                                           ior(FFTW_MEASURE,FFTW_MPI_TRANSPOSED_OUT))

Finally, the transform would be performed as:

call fftw_mpi_execute_dft_c2r(plan, src, tgt)

However, this prescription does not work. The last call causes a segmentation fault. At first, i thought this might have something to do with how I allocate src and tgt arrays, but playing with different amount of memory allocated to tgt did not give any result. So, I am either doing something really silly, or this is not possible to do at all.

EDIT : MINIMALISTIC COMPILEABLE EXAMPLE

program trashingfftw
  use, intrinsic :: iso_c_binding
  use MPI

  implicit none
  include 'fftw3-mpi.f03'

  integer(C_INTPTR_T), parameter :: L = 256
  integer(C_INTPTR_T), parameter :: M = 256

  type(C_PTR) :: plan, ctgt, csrc

  complex(C_DOUBLE_COMPLEX), pointer :: src(:,:)
  real(8), pointer :: tgt(:,:)

  integer(C_INTPTR_T) :: alloc_local, local_M, &
                         & local_L,local_offset1,local_offset2

  integer :: ierr,id


  call mpi_init(ierr)

  call mpi_comm_rank(MPI_COMM_WORLD,id,ierr)

  call fftw_mpi_init()


  alloc_local = fftw_mpi_local_size_2d(M,L/2+1, MPI_COMM_WORLD, &
       local_M, local_offset1)

  csrc = fftw_alloc_complex(alloc_local)
  call c_f_pointer(csrc, src, [L/2,local_M])


  alloc_local = fftw_mpi_local_size_2d(2*(L/2+1),M, MPI_COMM_WORLD, &
       &                               local_L, local_offset2)

  ctgt = fftw_alloc_real(alloc_local)
  call c_f_pointer(ctgt, tgt, [M,local_L])

  plan =  fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & 
       ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT))

  call fftw_mpi_execute_dft_c2r(plan, src, tgt)

  call mpi_finalize(ierr)


end program trashingfftw

解决方案

And the answer is:

For mpi real transforms, there are only two allowed combinations of transpositions and directions:

  • real to complex transform and FFTW_MPI_TRANSPOSED_OUT
  • complex to real transform and FFTW_MPI_TRANSPOSED_IN

I have found this while digging inside the fftw3 ver. 3.3.4 code, file "rdft2-problem.c", comment on the line 120.

EDIT:

MINIMAL COMPILABLE AND WORKING EXAMPLE:

program trashingfftw
  use, intrinsic :: iso_c_binding
  use MPI

  implicit none
  include 'fftw3-mpi.f03'

  integer(C_INTPTR_T), parameter :: L = 256
  integer(C_INTPTR_T), parameter :: M = 256

  type(C_PTR) :: plan, ctgt, csrc

  complex(C_DOUBLE_COMPLEX), pointer :: src(:,:)
  real(8), pointer :: tgt(:,:)

  integer(C_INTPTR_T) :: alloc_local, local_M, &
                         & local_L,local_offset1,local_offset2

  integer :: ierr,id


  call mpi_init(ierr)

  call mpi_comm_rank(MPI_COMM_WORLD,id,ierr)

  call fftw_mpi_init()


  alloc_local = fftw_mpi_local_size_2d(L/2+1,M, MPI_COMM_WORLD, &
       local_l, local_offset1)

  print *, id, "alloc complex=",alloc_local, local_l

  csrc = fftw_alloc_complex(alloc_local)
  call c_f_pointer(csrc, src, [M,local_l])

  !Caveat: Must partition the real storage according to complex layout, this is why
  ! I am using M and L/2+1 instead of M, 2*(L/2+1) as it was done in the original post
  alloc_local = fftw_mpi_local_size_2d(M,L/2+1, MPI_COMM_WORLD, &
      &                               local_M, local_offset2)

  print *, id, "alloc real=",alloc_local, local_m
  ! Two reals per complex
  ctgt = fftw_alloc_real(2*alloc_local)
  ! Only the first L are relevant, the rest is just dangling space (see fftw3 docs) 
  !caveat: since the padding is in the first index, the 2d data is laid out non-contiguously 
  !(L sensible reals, padding, padding, L sensible reals, padding, padding, ....)
  call c_f_pointer(ctgt, tgt, [2*(L/2+1),local_m])


  plan =  fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & 
       ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_IN))

  ! Should be non-null
  print *, 'plan:', plan

  src(3,2)=(1.,0)
  call fftw_mpi_execute_dft_c2r(plan, src, tgt) 

  call mpi_finalize(ierr)
end program thrashingfftw

这篇关于如何做一个fftw3 MPI“转置”如果可能的话,2D变换?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆