Rounding of double precision to single precision: Forcing an upper bound


Problem description

I'm using a Mersenne Twister implementation which provides me numbers with double precision.

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/FORTRAN/fortran.html (implementation in Fortran 77 by Tsuyoshi Tada, I'm using genrand_real2)

However, my application needs a single-precision random number, in order to avoid warnings when multiplying numbers of different precisions. So I wrote a small function to convert between the two data types:

    function genrand_real()

    real   genrand_real
    real*8 genrand_real2

    genrand_real = real(genrand_real2())

    return
    end

I'm using real and real*8 to be consistent with the code I'm working on. It works perfectly most of the time (besides the fact that I'm not sure how fast real() is). However, it changes the upper bound of my RNG, since the conversion turns [0,1) into [0,1]. I never thought about that until I ran into problems with it.
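To make the effect concrete, here is a minimal sketch of the rounding, assuming default real is IEEE 754 single precision with round-to-nearest (the program name rounding_demo is made up for illustration). The largest value genrand_real2 can return is (2^32 - 1)/2^32, which lies closer to 1.0 than to the largest single-precision number below 1.0, so the conversion rounds it up:

```fortran
program rounding_demo
  implicit none
  double precision :: d
  real :: s

  ! Largest value genrand_real2 can return: (2**32 - 1) / 2**32
  d = 4294967295.d0 / 4294967296.d0

  ! Conversion to single precision keeps only 24 significant bits
  s = real(d)

  print *, 'double is below 1: ', d .lt. 1.d0
  print *, 'single equals 1:   ', s .eq. 1.0
end program rounding_demo
```

On such a platform both comparisons should come out true, which is exactly the [0,1) to [0,1] change described above.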

My question is: how can I enforce the upper bound efficiently, or how could I write a function similar to genrand_real2 (the original one) that gives me single-precision reals? My guess is that I only need to replace the divisor 4294967296.d0, but I don't know by which number.

  function genrand_real2()

  double precision genrand_real2,r
  integer genrand_int32
  r=dble(genrand_int32())
  if(r.lt.0.d0)r=r+2.d0**32
  genrand_real2=r/4294967296.d0

  return
  end

Solution

The function you posted does NOT generate random numbers itself. It only maps the random integers returned by genrand_int32() onto the interval [0,1): it adds 2^32 first if the integer is negative and then divides by 2^32 (which is exactly 4294967296). 2^32 is the number of distinct values a 32-bit integer can hold, half of them negative and half non-negative (zero takes one of the non-negative slots, so the largest positive value is 2^31 - 1). The constant therefore comes from the range of genrand_int32(), not from the precision of the result.

Imagine you had numbers from -10 to 10 and wanted to restrict them to the interval [0,1]. The easiest solution is to add 20 to the negative numbers (so positive stay 0-10 and negative become 10-20) and then divide by 20. That's exactly what the function is doing, just with 2^31 instead of 10.

If you are wondering why the interval for your function is [0, 1): Since the number 0 also needs a spot and the bit-representation can only store 2^32 numbers, you can't have 2^31 negative and 2^31 positive numbers AND 0. The solution is to leave out the value +2^31 (highest positive one) and consequently 1 is excluded from your interval.
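The mapping described above can be traced for a few boundary values. This is only an illustrative sketch, assuming a 32-bit default integer; the program name mapping_demo and the chosen sample values are not part of the original code:

```fortran
program mapping_demo
  implicit none
  integer :: samples(4), i
  double precision :: r

  ! Zero, the largest positive value, a very negative value, and -1
  samples = (/ 0, huge(0), -huge(0), -1 /)

  do i = 1, 4
     r = dble(samples(i))
     ! Negative integers are shifted up into the upper half of [0, 2**32)
     if (r .lt. 0.d0) r = r + 2.d0**32
     ! Dividing by 2**32 scales the result into [0,1)
     print *, samples(i), r / 4294967296.d0
  end do
end program mapping_demo
```

Non-negative integers land in the lower half of the interval and negative ones in the upper half, with -1 mapping to the largest result, (2^32 - 1)/2^32.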

So to bring the whole thing down to single precision:

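  function genrand_real2()

  real genrand_real2,r
  integer genrand_int32
  ! convert the signed 32-bit integer to a single-precision real
  r=real(genrand_int32())
  ! real literals: 2**32 and 4294967296 would overflow a default integer
  if(r.lt.0.0)r=r+2.0**32
  genrand_real2=r/4294967296.0

  return
  end

The magic numbers keep the same values, because they relate to the integers, not the reals. They do have to be written as real literals, though, since 2**32 and 4294967296 do not fit in a default 32-bit integer.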
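One caveat worth checking: single precision carries only 24 significant bits, so integers close to 2^32 can still round up during the conversion and yield exactly 1.0. A possible workaround, which is a different technique from the function above and only a sketch, is to keep just the top 24 bits of the integer so that every step is exact. The name genrand_real_sp and the stub genrand_int32 below are hypothetical; the stub stands in for the real generator and returns the worst case (all bits set). A 32-bit default integer and IEEE single precision are assumed.

```fortran
program top24_demo
  implicit none
  real :: genrand_real_sp
  external genrand_real_sp

  ! Worst-case input (all bits set) must still give a value below 1
  print *, 'below 1: ', genrand_real_sp() .lt. 1.0
end program top24_demo

function genrand_real_sp()
  implicit none
  real :: genrand_real_sp
  integer :: genrand_int32
  external genrand_int32
  integer :: k

  ! Logical shift right by 8 keeps the top 24 bits: 0 <= k <= 2**24 - 1
  k = ishft(genrand_int32(), -8)

  ! Integers below 2**24 are exact in single precision, and dividing by
  ! a power of two is exact, so the result is at most 1 - 2**(-24)
  genrand_real_sp = real(k) / 16777216.0
end function genrand_real_sp

! Hypothetical stub replacing the Mersenne Twister routine for this demo
function genrand_int32()
  implicit none
  integer :: genrand_int32
  genrand_int32 = -1
end function genrand_int32
```

The trade-off is that the low 8 bits of each random integer are discarded, which single precision could not represent anyway.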

Edit: You already said it yourself, so I'm just repeating it for other people: for portability it is technically not a good idea to rely on default types without specifying the precision. So you should define sp = selected_real_kind(6, 37) (sp for single precision) somewhere and then use real(kind=sp), 2.0_sp and so forth. However, this is more of an academic point.
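A minimal sketch of that kind-parameter approach might look as follows. The module name kinds and the extra dp parameter are assumptions added for illustration; only sp = selected_real_kind(6, 37) comes from the text above.

```fortran
module kinds
  implicit none
  ! At least 6 decimal digits and exponent range 10**37 (single precision)
  integer, parameter :: sp = selected_real_kind(6, 37)
  ! At least 15 decimal digits and exponent range 10**307 (double precision)
  integer, parameter :: dp = selected_real_kind(15, 307)
end module kinds

program kind_demo
  use kinds
  implicit none
  real(kind=sp) :: x

  ! Literal constants carry the kind as a suffix
  x = 2.0_sp

  ! Report the decimal precision and exponent range actually provided
  print *, precision(x), range(x)
end program kind_demo
```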
