Weighted sampling with replacement in Java

Question

Is there a function in Java, or in a library such as Apache Commons Math, that is equivalent to the MATLAB function randsample? More specifically, I want a function randSample that returns a vector of independent and identically distributed (i.i.d.) random variables drawn from a probability distribution that I specify. For example:

int[] a = randSample(new int[]{0, 1, 2}, 5, new double[]{0.2, 0.3, 0.5});
//        { 0 w.p. 0.2
// a[i] = { 1 w.p. 0.3
//        { 2 w.p. 0.5

The output should be distributed like that of the MATLAB code randsample([0 1 2], 5, true, [0.2 0.3 0.5]), where true means sampling with replacement.

If such a function does not exist, how do I write one?

Note: I know that a similar question has been asked on Stack Overflow but unfortunately it has not been answered.

Recommended answer

I'm pretty sure one doesn't exist, but it's pretty easy to write a function that produces samples like that. First off, Java does come with a random number generator: java.util.Random has an instance method, nextDouble(), that returns uniformly distributed doubles from 0.0 (inclusive) to 1.0 (exclusive).

import java.util.Random;

// nextDouble() is an instance method, so create a Random first.
Random random = new Random();
double someRandomDouble = random.nextDouble();
     // This will be a uniformly distributed
     // random variable in [0.0, 1.0).

For sampling with replacement, convert the pdf you have as input into a cdf (a running sum of the probabilities). You can then turn each random double Java provides into a sample by seeing which segment of the cdf it falls in. So first you need to convert the pdf into a cdf.

int [] randsample(int[] values, int numsamples, 
        boolean withReplacement, double [] pdf) {

    if(withReplacement) {
        double[] cdf = new double[pdf.length];
        cdf[0] = pdf[0];
        for(int i=1; i<pdf.length; i++) {
            cdf[i] = cdf[i-1] + pdf[i];
        }
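
As a worked example of this step, here is the running sum for the probabilities from the question, together with the lookup rule it enables. The symbols p_k, F_k, x_k and U are notation introduced here for illustration, not taken from the answer.

```latex
% pdf from the question: p = (0.2, 0.3, 0.5) over the values x = (0, 1, 2)
F_k = \sum_{j=0}^{k} p_j
\quad\Longrightarrow\quad
F = (0.2,\; 0.5,\; 1.0)

% With U uniform on [0, 1), return x_k for the smallest k such that U < F_k:
\Pr(\text{return } x_k) = \Pr(F_{k-1} \le U < F_k) = F_k - F_{k-1} = p_k,
\qquad F_{-1} := 0

% Example: U = 0.37 satisfies 0.2 \le 0.37 < 0.5, so the sample is x_1 = 1.
```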

Then you make the properly-sized array of ints to store the result and start finding the random results:

        Random random = new Random();
        int[] results = new int[numsamples];
        for(int i=0; i<numsamples; i++) {
            double randomValue = random.nextDouble(); // Uniform in [0.0, 1.0).
            int currentPosition = 0;

            // The bounds check comes first so cdf is never indexed out of range.
            while(currentPosition < cdf.length && randomValue >= cdf[currentPosition]) {
                currentPosition++; //Check the next one.
            }

            if(currentPosition < cdf.length) { //It worked!
                results[i] = values[currentPosition];
            } else { //The cdf ended below the random value (e.g. rounding); fail gracefully.
                results[i] = values[cdf.length-1]; 
                     // And assign it the last value.
            }
        }

        //Now we're done and can return the results!
        return results;
    } else { //Without replacement.
        // Unchecked, so the method needs no throws clause.
        throw new UnsupportedOperationException("This is unimplemented!");
    }
}
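
For anyone who wants to try the approach end to end, here is a minimal self-contained sketch of the with-replacement case only. The class name WeightedSampler, the extra Random parameter (so a seed can be fixed), and the frequency check in main are assumptions added for illustration; they are not part of the answer above.

```java
import java.util.Random;

// Hypothetical helper class; the name is an assumption for this sketch.
class WeightedSampler {

    // Draws numSamples values i.i.d. from `values`, where values[k] is
    // chosen with probability pdf[k]. Assumes pdf is valid (sums to 1).
    static int[] randSample(int[] values, int numSamples, double[] pdf, Random random) {
        // Running sum of the probabilities: cdf[k] = pdf[0] + ... + pdf[k].
        double[] cdf = new double[pdf.length];
        double sum = 0.0;
        for (int k = 0; k < pdf.length; k++) {
            sum += pdf[k];
            cdf[k] = sum;
        }

        int[] results = new int[numSamples];
        for (int i = 0; i < numSamples; i++) {
            double u = random.nextDouble(); // uniform in [0.0, 1.0)
            int k = 0;
            // Stop at the first segment whose upper edge exceeds u. The bound
            // on k makes the last value absorb any rounding error in the cdf.
            while (k < cdf.length - 1 && u >= cdf[k]) {
                k++;
            }
            results[i] = values[k];
        }
        return results;
    }

    public static void main(String[] args) {
        int[] values = {0, 1, 2};
        double[] pdf = {0.2, 0.3, 0.5};
        int n = 100_000;
        int[] sample = randSample(values, n, pdf, new Random(42L));

        // Tally how often each value appears. Indexing counts by the value
        // itself only works here because the values happen to be 0, 1, 2.
        int[] counts = new int[values.length];
        for (int v : sample) {
            counts[v]++;
        }
        // The printed frequencies should land near 0.2, 0.3 and 0.5.
        for (int k = 0; k < counts.length; k++) {
            System.out.printf("value %d: %.3f%n", values[k], counts[k] / (double) n);
        }
    }
}
```

The linear scan is fine for a handful of values; for a long pdf, a binary search over the cdf would be the usual swap.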

There's some error checking (make sure value array and pdf array are the same size) and some other features you can implement by overloading this to provide the other functions, but hopefully this is enough for you to start. Cheers!
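
The error checking mentioned above could look roughly like this. It is only a sketch: the class name SampleInputCheck, the method name validate, and the 1e-9 tolerance on the probability sum are all assumptions made for illustration.

```java
// Hypothetical input checks to run before sampling; names are assumptions.
class SampleInputCheck {

    // Assumed tolerance for floating-point error in the probability sum.
    static final double TOLERANCE = 1e-9;

    // Throws IllegalArgumentException if the arrays cannot describe a
    // discrete distribution over `values`; returns normally otherwise.
    static void validate(int[] values, double[] pdf) {
        if (values == null || pdf == null) {
            throw new IllegalArgumentException("values and pdf must not be null");
        }
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        if (values.length != pdf.length) {
            throw new IllegalArgumentException(
                "values has " + values.length + " entries but pdf has " + pdf.length);
        }
        double sum = 0.0;
        for (double p : pdf) {
            if (Double.isNaN(p) || p < 0.0) {
                throw new IllegalArgumentException("probabilities must be non-negative numbers");
            }
            sum += p;
        }
        if (Math.abs(sum - 1.0) > TOLERANCE) {
            throw new IllegalArgumentException("probabilities sum to " + sum + ", expected 1.0");
        }
    }

    public static void main(String[] args) {
        // A valid distribution passes silently.
        validate(new int[]{0, 1, 2}, new double[]{0.2, 0.3, 0.5});

        // Mismatched array lengths are rejected.
        try {
            validate(new int[]{0, 1}, new double[]{0.2, 0.3, 0.5});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Calling a check like this at the top of randsample keeps bad input from surfacing later as an ArrayIndexOutOfBoundsException or as silently skewed samples.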
