Finding average using reduce and collect

Question

I am trying to understand the new Java 8 Stream APIs.

http://docs.oracle.com/javase/tutorial/collections/streams/reduction.html

I found the example there that computes the average of a set of numbers using the collect API, but I felt the same could be done with reduce() as well.

import java.util.stream.Stream;

public class Test {

    public static void main(String[] args) {
        // Using collect
        System.out.println(Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
            .collect(Averager::new, Averager::accept, Averager::combine)
            .average());

        // Using reduce
        System.out.println(Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
            .reduce(new Averager(), (t, u) -> {
                t.accept(u);
                return t;
            }, (t, u) -> {
                t.combine(u);
                return t;
            }).average());
    }

    private static class Averager {
        private int total = 0;
        private int count = 0;

        public Averager() {
            // System.out.println("Creating averager");
        }

        public double average() {
            // System.out.println("Finding average");
            return count > 0 ? ((double) total) / count : 0;
        }

        public void accept(int i) {
            // System.out.println("Accepting " + i);
            total += i;
            count++;
        }

        public void combine(Averager other) {
            // System.out.println("Combining the averager : " + other);
            total += other.total;
            count += other.count;
        }

        @Override
        public String toString() {
            return "[total : " + total + ", count: " + count + "]";
        }
    }
}

1) Is there any reason I should use collect instead of reduce here?
2) If I enable all the debug print statements, I can see that the operations performed are exactly the same for collect and reduce. The combiner is not used at all in either case.
3) If I make the streams parallel, collect always returns the correct result, but reduce() gives me a different result each time.
4) Should I avoid reduce in parallel streams?

Thanks,
Paul

Solution

The difference between reduce and collect is that collect is an enhanced form of reduction that can deal with mutable objects in parallel. The collect algorithm thread-confines the various result objects, so that they can be mutated safely, even if they aren't thread-safe. That's why Averager works using collect. For sequential computation using reduce this doesn't usually matter, but for parallel computation it will give incorrect results, as you observed.
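
As a rough illustration of that thread confinement, the sketch below runs collect on a parallel stream with a mutable accumulator shaped like the Averager from the question. The class name CollectConfinementDemo and the CONTAINERS counter are hypothetical additions for this example; the counter only records how often the supplier is invoked, and that number may vary between runs and machines.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class CollectConfinementDemo {

    // Mutable, non-thread-safe accumulator, same shape as Averager in the question.
    static class Averager {
        private int total = 0;
        private int count = 0;

        void accept(int i) { total += i; count++; }

        void combine(Averager other) { total += other.total; count += other.count; }

        double average() { return count > 0 ? ((double) total) / count : 0; }
    }

    // Hypothetical instrumentation: counts how many result containers were created.
    static final AtomicInteger CONTAINERS = new AtomicInteger();

    static double parallelAverage() {
        return Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .parallel()
                // The supplier runs once per result container; each container is
                // mutated by one thread at a time, and partial results are merged
                // with combine.
                .collect(() -> { CONTAINERS.incrementAndGet(); return new Averager(); },
                         Averager::accept,
                         Averager::combine)
                .average();
    }

    public static void main(String[] args) {
        System.out.println(parallelAverage());
        System.out.println("containers created: " + CONTAINERS.get());
    }
}
```

Because every container comes from its own supplier call, no Averager instance is mutated by two threads at once, which is why the average stays correct in parallel.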

A key point is that reduce works as long as it is dealing with values rather than mutable objects. You can see this by looking at the first argument to reduce. The example code passes new Averager(), a single object that multiple threads use as the identity value in the parallel reduction. Parallel streams work by splitting the workload into segments that are processed by individual threads. If multiple threads mutate the same (non-thread-safe) object, it should be clear why this leads to incorrect results.
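
To make the shared-identity problem concrete without relying on thread timing, here is a minimal sequential sketch. The names SharedIdentityDemo, resultIsIdentity and totalLeftInIdentity are hypothetical, and the lambdas are the mutating ones from the question. Even on a sequential stream, the object passed as the identity is the very object that ends up holding the result, so it is not a neutral starting value at all.

```java
import java.util.stream.Stream;

class SharedIdentityDemo {

    // Mutable accumulator, same shape as Averager in the question.
    static class Averager {
        int total = 0;
        int count = 0;

        void accept(int i) { total += i; count++; }

        void combine(Averager other) { total += other.total; count += other.count; }
    }

    // Reduces with the mutating lambdas from the question (sequential stream).
    static Averager reduceInto(Averager identity) {
        return Stream.of(1, 2, 3, 4)
                .reduce(identity,
                        (t, u) -> { t.accept(u); return t; },
                        (t, u) -> { t.combine(u); return t; });
    }

    // True when the reduction handed back the identity object itself.
    static boolean resultIsIdentity() {
        Averager identity = new Averager();
        return reduceInto(identity) == identity;
    }

    // The total that the "identity" object holds after the reduction.
    static int totalLeftInIdentity() {
        Averager identity = new Averager();
        reduceInto(identity);
        return identity.total;
    }

    public static void main(String[] args) {
        System.out.println("result is the identity object: " + resultIsIdentity());
        System.out.println("total left in identity: " + totalLeftInIdentity());
    }
}
```

In a parallel run, that same single object would be handed to every segment as its starting value, so several threads would be calling accept on it concurrently.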

It is possible to use reduce to compute an average, but you need to make your accumulation object immutable. Consider an object ImmutableAverager:

static class ImmutableAverager {
    private final int total;
    private final int count;

    public ImmutableAverager() {
        this.total = 0;
        this.count = 0;
    }

    public ImmutableAverager(int total, int count) {
        this.total = total;
        this.count = count;
    }

    public double average() {
        return count > 0 ? ((double) total) / count : 0;
    }

    public ImmutableAverager accept(int i) {
        return new ImmutableAverager(total + i, count + 1);
    }

    public ImmutableAverager combine(ImmutableAverager other) {
        return new ImmutableAverager(total + other.total, count + other.count);
    }
}

Note that I've adjusted the signatures of accept and combine to return a new ImmutableAverager instead of mutating this. (These changes also make the methods match the function arguments to reduce so we can use method references.) You'd use ImmutableAverager like this:

    System.out.println(Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
            .parallel()
            .reduce(new ImmutableAverager(), 
                    ImmutableAverager::accept,
                    ImmutableAverager::combine)
            .average());

Using immutable value objects with reduce should give the correct results in parallel.

Finally, note that IntStream and DoubleStream have summaryStatistics() methods and Collectors has averagingDouble, averagingInt, and averagingLong methods that can do these computations for you. However, I think the question is more about the mechanics of collection and reduction than about how to do averaging most concisely.
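
For completeness, a short sketch of those built-in alternatives, assuming the same ten integers as above. The class name BuiltInAverageDemo and its method names are hypothetical; summaryStatistics(), IntSummaryStatistics.getAverage() and Collectors.averagingInt are standard Java 8 library calls.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class BuiltInAverageDemo {

    // Average via IntStream.summaryStatistics(), which also tracks min, max, sum and count.
    static double viaSummaryStatistics() {
        IntSummaryStatistics stats = IntStream.rangeClosed(1, 10)
                .parallel()
                .summaryStatistics();
        return stats.getAverage();
    }

    // Average via the Collectors.averagingInt collector on a Stream<Integer>.
    static double viaCollector() {
        return Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .parallel()
                .collect(Collectors.averagingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        System.out.println(viaSummaryStatistics()); // 5.5
        System.out.println(viaCollector());         // 5.5
    }
}
```

Both variants are built on mutable reduction internally, so they remain safe on parallel streams without any hand-written accumulator.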
