Using two streams in Java lambda to compute covariance

Question

Let's say I have two arrays of double. I've been experimenting with Stream from Java 8. I think I've understood the main ideas, but then I realised that I'm not sure how to manipulate two Streams at the same time.

For example, I want to calculate the covariance of both arrays.

import java.util.Arrays;

public class foo {

    public static double mean(double[] xs) {
        return Arrays.stream(xs).average().getAsDouble();
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        double[] ys = {1517.93, 1757.78, 1981.1, 2215.73, 2942.66, 3558.32, 4063.91, 4521.16, 5101.76, 5234.12};

        System.out.println("Mean of xs: " + mean(xs));
        double xs_sumDeviation = Arrays.stream(xs)
            .boxed()
            .mapToDouble(d -> d.doubleValue() - mean(xs))
            .sum();

        // Covariance (intended). This part does not compile:
        // Arrays.stream has no overload that takes two arrays.
        double covXY = Arrays.stream(xs, ys)
            .mapToDouble((x, y) -> {
                double numerator = (x - mean(xs)) * (y - mean(ys));
                double denominator = Math.sqrt((x - mean(xs)) * (x - mean(xs)));
                return numerator / denominator;
            })
            .sum();
    }
}

Thank you for any suggestions.

Attempt 1.

public static double covariance(double[] xs, double[] ys) {
    double xmean = mean(xs);
    double ymean = mean(ys);
    double numerator = IntStream.range(0, Math.min(xs.length, ys.length))
            .parallel()
            .mapToDouble(i -> (xs[i] - xmean) * (ys[i] - ymean))
            .sum();
    double denominator = Math.sqrt(IntStream.range(0, xs.length)
            .parallel()
            .mapToDouble(i -> (xs[i] - xmean) * (xs[i] - xmean))
            .sum());
    return numerator / denominator;
}

Recommended answer

In other programming languages, there is some kind of zip function that takes several iterables and returns an iterator that aggregates elements from each of them. See, for example, Python's built-in zip function.
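To make the idea concrete, here is a minimal sketch of what a zip over two Java streams could look like. The helper name `zip`, the class `ZipSketch`, and the `demo` method are all hypothetical; the JDK's Stream API does not ship such a method. The sketch is deliberately sequential only and stops as soon as the shorter input is exhausted.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ZipSketch {

    // Hypothetical helper, not part of the JDK: pairs up the elements of two
    // streams and ends when either stream runs out of elements.
    public static <A, B, R> Stream<R> zip(Stream<A> as, Stream<B> bs,
                                          BiFunction<A, B, R> combiner) {
        Iterator<A> ia = as.iterator();
        Iterator<B> ib = bs.iterator();
        Iterator<R> zipped = new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return ia.hasNext() && ib.hasNext();
            }

            @Override
            public R next() {
                return combiner.apply(ia.next(), ib.next());
            }
        };
        // Built as a sequential stream (second argument false): an
        // iterator-backed spliterator of unknown size splits poorly.
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(zipped, Spliterator.ORDERED),
                false);
    }

    // Small demonstration: the third element of the first stream is dropped.
    public static List<String> demo() {
        return zip(Stream.of(1, 2, 3), Stream.of("a", "b"), (n, s) -> n + s)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1a, 2b]
    }
}
```

Because the result is driven by two iterators advancing in lockstep, the work cannot be split cleanly across threads.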

Although it would be possible to write a similar function in Java, it is hard to implement it in a way that supports efficient parallel execution. However, there is a commonly used pattern in Java that is a bit different. In your case, it might look as follows:

public static double covariance(double[] xs, double[] ys) {
    double xmean = mean(xs);
    double ymean = mean(ys);
    return IntStream.range(0, Math.min(xs.length, ys.length))
        .parallel()
        .mapToDouble(i -> {
                double numerator = (xs[i] - xmean) * (ys[i] - ymean);
                double denominator = ...;
                return numerator / denominator;
            })
        .sum();
}

Instead of combining two streams, you create an IntStream over all indexes and access the elements of the different collections by index. That works pretty well as long as the collections support random access.
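As a complete, runnable illustration of the index-based pattern, the sketch below computes the textbook sample covariance, sum((x - meanX) * (y - meanY)) / (n - 1). Note the assumptions: this formula is swapped in for illustration and is not the one from the question, the class name `CovarianceSketch` is made up, and the method insists on equal-length arrays rather than truncating to the shorter one.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class CovarianceSketch {

    // Arithmetic mean; throws instead of returning NaN for an empty array.
    public static double mean(double[] values) {
        return Arrays.stream(values)
                .average()
                .orElseThrow(() -> new IllegalArgumentException("empty array"));
    }

    // Sample covariance (divides by n - 1). Assumption for this sketch:
    // both arrays must have the same length and at least two elements.
    public static double covariance(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException(
                    "arrays must have equal length of at least 2");
        }
        // Compute the means once, outside the lambda, so they are not
        // recomputed for every element.
        double xMean = mean(xs);
        double yMean = mean(ys);
        // Stream over the indexes and read both arrays by position.
        double sumOfProducts = IntStream.range(0, xs.length)
                .mapToDouble(i -> (xs[i] - xMean) * (ys[i] - yMean))
                .sum();
        return sumOfProducts / (xs.length - 1);
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3, 4, 5};
        double[] ys = {2, 4, 6, 8, 10};
        System.out.println(covariance(xs, ys)); // prints 5.0
    }
}
```

Adding `.parallel()` after `IntStream.range(...)` is possible because each index is processed independently, though for small arrays the sequential version is likely the simpler choice.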
