从存储为F#的概率序列的离散分布函数中提取随机数 [英] Drawing a random number from a discrete distribution function stored as a sequence of probabilities in F#

查看:115
本文介绍了从存储为F#的概率序列的离散分布函数中提取随机数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有一个有限长度N的给定浮点序列(0到1之间),它表示整数0..N-1上的分布函数.我们正在尝试从此分布中提取一个随机数.一种实现方法是在[0,1](浮点数)中绘制一个统一的随机变量,然后计算该数字的逆累积分布函数.

There is a given sequence of floats (between 0 and 1) of finite length N which denotes a distribution function over integers 0..N-1. We are trying to draw a random number from this distribution. One way of doing it is to draw a uniform random variable in [0, 1] (float), and then calculate the inverse cumulative distribution function for that number.

如果分发位于数组中,则代码将如下所示:

If the distribution was in an array, the code would look something like this:

let matched distribution draw =
  let rec matchRest distribution draw start = 
    if start = Array.length distribution then start-1
    else
      let left = draw - distribution.[start]
      if left <= 0 then start
      else matchRest distribution left (start+1)
  matchRest distribution draw 0

其中distribution是分布函数,draw是统一的[0,1]数字.

where distribution is the distribution function and draw is the uniform [0,1] number.

当分配为任意序列时,如何重写此功能以起作用?显然我可以创建一个临时数组,但这似乎不是一个很好的解决方案...

How can I rewrite this function to work when distribution is any sequence? Obviously I could create a temporary array, but it doesn't seem to be an elegant solution...

推荐答案

您的matched函数是一个搜索过程.您无需分解顺序即可搜索适当的索引; Seq模块中的高阶函数可以帮助您:

Your matched function is a search procedure. You don't have to break down the sequence to search for an appropriate index; high-order functions from Seq module could help you:

/// Taken from http://missingfaktor.blogspot.in/2012/02/f-option-cheat-sheet.html
let getOrElse d = function
    | Some a -> a
    | None -> d

let matched distribution draw =
    distribution 
    |> Seq.scan (fun (s, d0) d -> (s+1, d0-d)) (0, draw)
    |> Seq.tryPick (fun (s, d) -> if d <= 0 then Some s else None)
    |> getOrElse (Seq.length distribution-1)

在最坏的情况下,您仍然可以通过Seq.length枚举整个序列.我认为最好将Seq.length distribution-1更改为已知常数.

In the worst case, you still enumerate the whole sequence via Seq.length. I think it's better to change Seq.length distribution-1 to a known constant if possible.

这篇关于从存储为F#的概率序列的离散分布函数中提取随机数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆