How to get the Fortran SUM command result to exceed 2^24 for single precision arrays


Problem description

To check memory allocations, we populate single precision arrays with unit values and interrogate them with the SUM and DOT_PRODUCT intrinsics. These intrinsics stop counting after 16777216 (= 2^24). How can we get them to count billions of elements? We prefer to avoid DO loops. The problem does not occur with higher precision arrays.

program allocator

  use iso_fortran_env
  implicit NONE

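  ! REAL32 and INT64 are already kind values; passing them to selected_real_kind
  ! or selected_int_kind would be read as a decimal precision/range request instead.
  integer, parameter :: sp    = REAL32
  integer, parameter :: xlint = INT64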

  integer ( xlint )                            :: n = 100000000
  real    ( sp ), allocatable, dimension ( : ) :: array

  integer   ( xlint )     :: alloc_status = 0
  character ( len = 255 ) :: alloc_msg = ""

!   ALLOCATE
    allocate ( array ( 1 : n ), stat = alloc_status, errmsg = alloc_msg )
    if ( alloc_status /= 0 ) print *, 'allocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg

!   POPULATE
    array = 1.0_sp
    write ( *, '( "number of elements allocated = ", g0 )' )    n
    write ( *, '( "sum of elements              = ", g0 )' )    sum ( array )
    write ( *, '( "dot product                  = ", g0, / )' ) dot_product ( array, array )

!   DEALLOCATE
    deallocate ( array, stat = alloc_status, errmsg = alloc_msg )
    if ( alloc_status /= 0 ) print *, 'deallocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg

    write ( *, '( "compiler version = ", A )' ) compiler_version()
    write ( *, '( "compiler options = ", A )' ) trim ( compiler_options() )

end program allocator

Output:

number of elements allocated = 100000000
sum of elements              = 16777216.
dot product                  = 16777216.

compiler version = GCC version 4.6.2 20111019 (prerelease)
compiler options = -fPIC -mmacosx-version-min=10.6.8 -mtune=core2

Solution

That's due to the limited precision of single precision reals.

Since you only have 24 bits for the significand (your significant "digits"), your relative resolution is 1/2**24 = 1/16777216. In other words, you cannot resolve the addition of 1/16777216 to 1 or, in your case, the addition of 1 to 16777216:

16777216 + 1 = 16777216
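
The arithmetic above can be checked directly. The following is a minimal sketch, assuming an IEEE 754 platform where REAL32 has a 24-bit significand; the program and variable names are made up for illustration. On such a platform the real32 comparison should come out true and the real64 comparison false.

```fortran
program precision_limit
  use iso_fortran_env, only: real32, real64
  implicit none

  real(real32) :: x32, y32
  real(real64) :: x64, y64

  ! 2**24 = 16777216: beyond this value real32 can no longer step by 1
  x32 = 2.0_real32**24
  y32 = x32 + 1.0_real32   ! result is rounded back to a representable real32
  x64 = 2.0_real64**24
  y64 = x64 + 1.0_real64   ! real64 has enough significand bits for this

  write (*, '("significand bits in real32   = ", i0)') digits(x32)
  write (*, '("real32: 2**24 + 1 == 2**24 ? ", l1)') y32 == x32
  write (*, '("real64: 2**24 + 1 == 2**24 ? ", l1)') y64 == x64
end program precision_limit
```

This is why a running single precision total of unit values stops growing once it reaches 16777216: every further addition of 1 is rounded away.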

To resolve this operation, which is required for both sum and dot_product (even if they are calculated using simple loops), you need (at least) another bit of precision. Accumulating in double precision provides far more than that:

program allocator

  use iso_fortran_env
  implicit NONE

  integer, parameter :: sp    = REAL32
  integer, parameter :: xlint = INT64

  integer ( xlint )                            :: n = 100000000
  real    ( sp ), allocatable, dimension ( : ) :: array
  real    ( REAL64 )                           :: s
  integer ( xlint )                            :: i

  integer   ( xlint )     :: alloc_status = 0
  character ( len = 255 ) :: alloc_msg = ""

!   ALLOCATE
    allocate ( array ( 1 : n ), stat = alloc_status, errmsg = alloc_msg )
    if ( alloc_status /= 0 ) print *, 'allocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg

!   POPULATE
    array = 1.0_sp
    write ( *, '( "number of elements allocated = ", g0 )' )    n
    write ( *, '( "sum of elements              = ", g0 )' )    sum ( array )
    write ( *, '( "dot product                  = ", g0, / )' ) dot_product ( array, array )

    ! Calculate the sum using a double precision float
    s = real( array(1), REAL64 )
    do i=2,n
      s = s + real( array(i), REAL64 )
    enddo ! i
    write ( *, '( "sum of elements              = ", g0 )' )    s
    ! Calculate the dot product using a double precision float
    s = real( array(1), REAL64 )**2
    do i=2,n
      s = s + real( array(i), REAL64 )**2
    enddo ! i
    write ( *, '( "dot product                  = ", g0, / )' ) s

!   DEALLOCATE
    deallocate ( array, stat = alloc_status, errmsg = alloc_msg )
    if ( alloc_status /= 0 ) print *, 'deallocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg

    write ( *, '( "compiler version = ", A )' ) compiler_version()
    write ( *, '( "compiler options = ", A )' ) trim ( compiler_options() )

end program allocator

Output:

number of elements allocated = 100000000
sum of elements              = 16777216.0
dot product                  = 16777216.0

sum of elements              = 100000000.00000000
dot product                  = 100000000.00000000

compiler version = GCC version 4.8.4 20140605 (prerelease)
compiler options = -cpp -iprefix /home/elias/opt/sde/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.4/ -mtune=generic -march=x86-64 -O0 -Wall -Wextra
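
Since the question asks to avoid DO loops, the same idea can also be written with the intrinsics themselves by converting the array to double precision inside the call. This is a sketch rather than a tuned solution: the array size below is a hypothetical value chosen only to be above 2**24, and whether the compiler creates a temporary REAL64 copy of the array (which would cost extra memory) is implementation dependent.

```fortran
program sum_without_loops
  use iso_fortran_env, only: real32, real64, int64
  implicit none

  ! Hypothetical size for illustration: just above 2**24 = 16777216
  integer(int64), parameter :: n = 20000000_int64
  real(real32), allocatable :: array(:)
  real(real64)              :: s, d

  allocate (array(n))
  array = 1.0_real32

  ! Convert each element to REAL64 before accumulating, so the running
  ! total has 53 significand bits instead of 24.
  s = sum(real(array, real64))
  d = dot_product(real(array, real64), real(array, real64))

  write (*, '("sum of elements = ", g0)') s
  write (*, '("dot product     = ", g0)') d

  ! All partial sums are integers far below 2**53, so the result is exact.
  if (s /= real(n, real64)) error stop 'sum does not match element count'
  if (d /= real(n, real64)) error stop 'dot product does not match element count'

  deallocate (array)
end program sum_without_loops
```

For arrays of billions of elements, a possible temporary copy may be too expensive; in that case the explicit loop with a REAL64 accumulator shown above keeps memory use unchanged.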
