Performance difference between Stream.map and Collectors.mapping

Question

Recently, while exploring the functional programming features of Java 8 and above, I came across the static method mapping in the Collectors class.

We have an Employee class like this:

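import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Getter;

// Lombok generates the all-args constructor, the builder, and the getters.
@AllArgsConstructor
@Builder
@Getter
public class Employee {
  private String name;
  private Integer age;
  private Double salary;
}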

Let's say we have a list of Employee POJOs and we want a list of all the employees' names. There are two similar approaches:

    List<Employee> employeeList
        = Arrays.asList(new Employee("Tom Jones", 45, 15000.00),
        new Employee("Harry Andrews", 45, 7000.00),
        new Employee("Ethan Hardy", 65, 8000.00),
        new Employee("Nancy Smith", 22, 10000.00),
        new Employee("Deborah Sprightly", 29, 9000.00));

    // IntelliJ suggests replacing the first approach with map and collect

    List<String> collect =
        employeeList
            .stream()
            .collect(
                Collectors.mapping(Employee::getName, Collectors.toList()));

    List<String> collect1 =
        employeeList
            .stream()
            .map(Employee::getName)
            .collect(Collectors.toList());

I know that the first approach uses a terminal operation on the Stream and the second uses an intermediate operation, but I want to know whether the first approach performs worse than the second, or vice versa. I would be grateful if you could explain any potential performance degradation in the first case when our data source (employeeList) grows significantly in size.

I created two simple test cases fed with records generated in a simple for loop. For small inputs, the difference between the "traditional" approach using Stream.map and Collectors.mapping is marginal. On the other hand, when the input grows to something like 30,000,000 records, Collectors.mapping surprisingly starts to perform slightly better. To give concrete numbers: for 30,000,000 records, Collectors.mapping takes 56 seconds for 10 iterations run as a @RepeatedTest, while the more familiar Stream.map followed by collect takes 5 seconds longer for the same input and the same number of iterations. I know that my provisional tests are not the best and cannot reflect reality because of JVM optimizations, but we can claim that for huge inputs Collectors.mapping can be preferable. Anyway, I think that this

Recommended answer

I doubt there is a meaningful performance difference. You'd have to benchmark it on your data to know for sure.
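As a rough illustration of what such a benchmark could look like, here is a minimal timing sketch. It is not a rigorous measurement: the Employee class is a simplified, hypothetical stand-in for the one in the question (no Lombok, name only), the helper names and the input size are arbitrary assumptions, and the warm-up loop only reduces JIT effects rather than eliminating them. A dedicated harness such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class MappingTimingSketch {

    // Hypothetical stand-in for the question's Employee class (no Lombok, name only).
    static final class Employee {
        private final String name;
        Employee(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Approach 1 from the question: Collectors.mapping as the top-level collector.
    static List<String> namesViaMapping(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.mapping(Employee::getName, Collectors.toList()));
    }

    // Approach 2 from the question: Stream.map followed by collect.
    static List<String> namesViaMap(List<Employee> employees) {
        return employees.stream()
                .map(Employee::getName)
                .collect(Collectors.toList());
    }

    // Generates n employees in a simple loop, as the question describes.
    static List<Employee> generate(int n) {
        List<Employee> employees = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            employees.add(new Employee("Employee " + i));
        }
        return employees;
    }

    // Runs the task `rounds` times and returns the average wall-clock time in milliseconds.
    static double averageMillis(Supplier<List<String>> task, int rounds) {
        long total = 0;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            List<String> result = task.get();
            total += System.nanoTime() - start;
            // Touch the result so the work cannot be treated as unused.
            if (result.isEmpty()) throw new IllegalStateException("unexpected empty result");
        }
        return total / 1_000_000.0 / rounds;
    }

    public static void main(String[] args) {
        List<Employee> employees = generate(1_000_000); // input size is an arbitrary choice

        // Warm-up rounds so JIT compilation is less likely to dominate the measured rounds.
        averageMillis(() -> namesViaMapping(employees), 5);
        averageMillis(() -> namesViaMap(employees), 5);

        System.out.printf("Collectors.mapping: %.2f ms%n",
                averageMillis(() -> namesViaMapping(employees), 10));
        System.out.printf("Stream.map:         %.2f ms%n",
                averageMillis(() -> namesViaMap(employees), 10));
    }
}
```

Both methods return the same list, so any difference you observe is purely a matter of timing, and it is likely to vary between runs, JVM versions, and heap settings.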

Note that mapping isn't actually intended to be used directly as a collector, but rather as a downstream collector within another collector:

The mapping() collectors are most useful when used in a multi-level reduction, such as downstream of a groupingBy or partitioningBy.
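To make the quoted sentence concrete, here is a small sketch of mapping used as a downstream collector of groupingBy. The Employee class is a simplified, hypothetical stand-in for the one in the question (no Lombok, no salary), and namesByAge and sampleEmployees are illustrative names, not anything from the original post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MappingDownstreamSketch {

    // Hypothetical stand-in for the question's Employee class (no Lombok, no salary).
    static final class Employee {
        private final String name;
        private final Integer age;
        Employee(String name, Integer age) {
            this.name = name;
            this.age = age;
        }
        String getName() { return name; }
        Integer getAge() { return age; }
    }

    // Same names and ages as the sample data in the question.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Tom Jones", 45),
                new Employee("Harry Andrews", 45),
                new Employee("Ethan Hardy", 65),
                new Employee("Nancy Smith", 22),
                new Employee("Deborah Sprightly", 29));
    }

    // Groups employees by age; mapping() reduces each group to just the names.
    static Map<Integer, List<String>> namesByAge(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getAge,
                        Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> byAge = namesByAge(sampleEmployees());
        System.out.println(byAge.get(45)); // the names of the two 45-year-old employees
    }
}
```

Here Stream.map cannot do the job on its own, because the mapping has to happen inside each group after the elements have been classified. That is the situation mapping is designed for.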

There is something in Effective Java 3rd Edition about this too (in Item 46, about 2/3 of the way down page 214, the paragraph starting "The collectors returned by the counting method"). Basically, it says not to use things like mapping in the first way you do here.
