How to get the Fortran SUM command result to exceed 2^24 for single precision arrays
Problem description
To check memory allocations, we populate single precision arrays with unit values and interrogate them with the SUM and DOT_PRODUCT intrinsics. These intrinsics stop counting after 16777216 (= 2^24). How can we get them to count billions of elements? We prefer to avoid DO loops. This is not a problem with higher precision arrays.
program allocator
  use iso_fortran_env
  implicit NONE
  ! REAL32 and INT64 are already kind values, so they are used directly;
  ! selected_real_kind / selected_int_kind expect decimal digits or range instead
  integer, parameter :: sp = REAL32
  integer, parameter :: xlint = INT64
  integer ( xlint ) :: n = 100000000
  real ( sp ), allocatable, dimension ( : ) :: array
  integer ( xlint ) :: alloc_status = 0
  character ( len = 255 ) :: alloc_msg = ""
  ! ALLOCATE
  allocate ( array ( 1 : n ), stat = alloc_status, errmsg = alloc_msg )
  if ( alloc_status /= 0 ) print *, 'allocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg
  ! POPULATE
  array = 1.0_sp
  write ( *, '( "number of elements allocated = ", g0 )' ) n
  write ( *, '( "sum of elements = ", g0 )' ) sum ( array )
  write ( *, '( "dot product = ", g0, / )' ) dot_product ( array, array )
  ! DEALLOCATE
  deallocate ( array, stat = alloc_status, errmsg = alloc_msg )
  if ( alloc_status /= 0 ) print *, 'deallocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg
  write ( *, '( "compiler version = ", A )' ) compiler_version()
  write ( *, '( "compiler options = ", A )' ) trim ( compiler_options() )
end program allocator
Output:
number of elements allocated = 100000000
sum of elements = 16777216.
dot product = 16777216.
compiler version = GCC version 4.6.2 20111019 (prerelease)
compiler options = -fPIC -mmacosx-version-min=10.6.8 -mtune=core2
That is due to the limited precision of single precision reals.
Since you only have 24 bits for your significant "digits", your relative resolution is 1/2**24 = 1/16777216. In other words, you cannot resolve the addition of 1/16777216 to 1 or, in your case,
16777216 + 1 = 16777216
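The limit can be checked directly with the standard inquiry intrinsics. The following is a minimal sketch, assuming an IEEE 754 single precision REAL32; the program and variable names are made up for illustration. Under that assumption it should report 24 significand bits and a spacing of 2 at 2**24, which is why adding 1 is lost while adding 2 is not.

```fortran
program precision_limit
  use iso_fortran_env, only: REAL32
  implicit none
  real ( REAL32 ) :: x, y1, y2

  x  = 16777216.0_REAL32      ! 2**24
  y1 = x + 1.0_REAL32         ! stored back into a REAL32 variable
  y2 = x + 2.0_REAL32

  ! digits() returns the number of significand bits of the real model
  write ( *, '( "significand bits = ", g0 )' ) digits ( x )
  ! spacing() returns the gap between x and the next representable value
  write ( *, '( "spacing at 2**24 = ", g0 )' ) spacing ( x )
  ! compare the stored results with the original value
  write ( *, '( "x + 1 equals x   : ", l1 )' ) ( y1 == x )
  write ( *, '( "x + 2 equals x   : ", l1 )' ) ( y2 == x )
end program precision_limit
```

Once the running total reaches 2**24, every further addition of 1.0 behaves like the `x + 1` case here, so the total stops growing.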
To resolve this addition, which both sum and dot_product have to perform (even if calculated using simple loops), you need (at least) one more bit of precision:
program allocator
  use iso_fortran_env
  implicit NONE
  integer, parameter :: sp = REAL32
  integer, parameter :: xlint = INT64
  integer ( xlint ) :: n = 100000000
  real ( sp ), allocatable, dimension ( : ) :: array
  real ( REAL64 ) :: s
  integer ( xlint ) :: i
  integer ( xlint ) :: alloc_status = 0
  character ( len = 255 ) :: alloc_msg = ""
  ! ALLOCATE
  allocate ( array ( 1 : n ), stat = alloc_status, errmsg = alloc_msg )
  if ( alloc_status /= 0 ) print *, 'allocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg
  ! POPULATE
  array = 1.0_sp
  write ( *, '( "number of elements allocated = ", g0 )' ) n
  write ( *, '( "sum of elements = ", g0 )' ) sum ( array )
  write ( *, '( "dot product = ", g0, / )' ) dot_product ( array, array )
  ! Calculate the sum using a double precision float
  s = real( array(1), REAL64 )
  do i=2,n
    s = s + real( array(i), REAL64 )
  enddo ! i
  write ( *, '( "sum of elements = ", g0 )' ) s
  ! Calculate the dot product using a double precision float
  s = real( array(1), REAL64 )**2
  do i=2,n
    s = s + real( array(i), REAL64 )**2
  enddo ! i
  write ( *, '( "dot product = ", g0, / )' ) s
  ! DEALLOCATE
  deallocate ( array, stat = alloc_status, errmsg = alloc_msg )
  if ( alloc_status /= 0 ) print *, 'deallocation error on ', n, ' elements: stat = ', alloc_status, ', errmsg = ', alloc_msg
  write ( *, '( "compiler version = ", A )' ) compiler_version()
  write ( *, '( "compiler options = ", A )' ) trim ( compiler_options() )
end program allocator
Output:
number of elements allocated = 100000000
sum of elements = 16777216.0
dot product = 16777216.0
sum of elements = 100000000.00000000
dot product = 100000000.00000000
compiler version = GCC version 4.8.4 20140605 (prerelease)
compiler options = -cpp -iprefix /home/elias/opt/sde/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.4/ -mtune=generic -march=x86-64 -O0 -Wall -Wextra
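Since the question asked to avoid DO loops, the same widening can probably be expressed with the intrinsics themselves by converting the array to REAL64 inside the call. This is a sketch rather than part of the original answer: the program name and the variables `total` and `dot` are made up, and it assumes the compiler accumulates in the REAL64 result type. The compiler may also create REAL64 temporaries that need twice the memory of the original array.

```fortran
program widened_sum
  use iso_fortran_env, only: REAL32, REAL64, INT64
  implicit none
  ! same element count as above; any value beyond 2**24 shows the effect
  integer ( INT64 ), parameter :: n = 100000000_INT64
  real ( REAL32 ), allocatable :: array ( : )
  real ( REAL64 ) :: total, dot
  integer :: alloc_status

  allocate ( array ( n ), stat = alloc_status )
  if ( alloc_status /= 0 ) error stop 'allocation failed'
  array = 1.0_REAL32

  ! Convert the elements to REAL64 before reducing, so the result
  ! carries 53 significand bits instead of 24.
  total = sum ( real ( array, REAL64 ) )
  dot   = dot_product ( real ( array, REAL64 ), real ( array, REAL64 ) )

  write ( *, '( "sum of elements = ", g0 )' ) total
  write ( *, '( "dot product     = ", g0 )' ) dot

  deallocate ( array )
end program widened_sum
```

The array itself stays in single precision; only the reduction is carried out in double precision, as in the explicit loops above.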