Performance difference between Stream.map and Collectors.mapping
Question
Recently I was exploring the nooks of functional programming in Java 8 and above, and I came across the static method mapping in the Collectors class.
We have an Employee class like this:
@AllArgsConstructor
@Builder
@Getter
public class Employee {
    private String name;
    private Integer age;
    private Double salary;
}
Let's say we have a list of Employee POJOs and we want a list of all the employees' names. There are two similar approaches:
List<Employee> employeeList =
    Arrays.asList(
        new Employee("Tom Jones", 45, 15000.00),
        new Employee("Harry Andrews", 45, 7000.00),
        new Employee("Ethan Hardy", 65, 8000.00),
        new Employee("Nancy Smith", 22, 10000.00),
        new Employee("Deborah Sprightly", 29, 9000.00));

// First approach: map inside the collector.
// IntelliJ suggests replacing this with map and collect.
List<String> collect =
    employeeList
        .stream()
        .collect(
            Collectors.mapping(Employee::getName, Collectors.toList()));

// Second approach: map as an intermediate operation, then collect.
List<String> collect1 =
    employeeList
        .stream()
        .map(Employee::getName)
        .collect(Collectors.toList());
I know that the first approach does the mapping inside the terminal operation on the Stream, while the second uses an intermediate operation on the Stream, but I want to know whether the first approach performs worse than the second, or vice versa. I would be grateful if you could explain any potential performance degradation in the first case as our data source (employeeList) grows significantly in size.
I created two simple test cases fed by records generated in a simple for loop. For small inputs, the difference between the "traditional" approach using Stream.map and Collectors.mapping is marginal. On the other hand, when the number of records grows considerably, to something like 30000000, Collectors.mapping surprisingly starts to perform a little better. To give concrete numbers: for an input of 30000000 records, Collectors.mapping takes 56 seconds for 10 iterations run as a @RepeatedTest, while the more recognizable approach of Stream.map followed by collect takes 5 seconds longer for the same input and the same number of iterations. I know that my provisional tests are not the best and may not reflect reality because of JVM optimizations, but they suggest that for huge inputs Collectors.mapping can be preferable. Anyway, I think that this
Recommended answer
I doubt there is a meaningful performance difference. You'd have to benchmark it on your data to know for sure.
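As a rough starting point for such a benchmark, here is a minimal hand-rolled timing sketch. It is an illustration under assumptions, not a rigorous measurement: the class name MapVsMappingTiming, the name-only Employee stand-in, the input size, and the iteration counts are all made up for this example, and a dedicated harness such as JMH would give far more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class MapVsMappingTiming {

    // Minimal stand-in for the question's Employee (assumption: only the name matters here).
    static final class Employee {
        private final String name;

        Employee(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Accumulates result sizes so the timed work is visibly used.
    static long sink;

    // First approach from the question: mapping inside the collector.
    static List<String> viaMapping(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.mapping(Employee::getName, Collectors.toList()));
    }

    // Second approach from the question: map as an intermediate operation.
    static List<String> viaMap(List<Employee> employees) {
        return employees.stream()
                .map(Employee::getName)
                .collect(Collectors.toList());
    }

    // Generates n employees in a simple for loop, as described in the question.
    static List<Employee> generate(int n) {
        List<Employee> employees = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            employees.add(new Employee("Employee " + i));
        }
        return employees;
    }

    // Runs the task repeatedly and returns the average wall-clock time in milliseconds.
    static double averageMillis(Supplier<List<String>> task, int iterations) {
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            List<String> result = task.get();
            totalNanos += System.nanoTime() - start;
            sink += result.size();
        }
        return totalNanos / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        List<Employee> employees = generate(1_000_000); // arbitrary size for illustration

        // Warm-up rounds so both code paths are likely JIT-compiled before measuring.
        averageMillis(() -> viaMapping(employees), 5);
        averageMillis(() -> viaMap(employees), 5);

        System.out.printf("Collectors.mapping: %.2f ms%n",
                averageMillis(() -> viaMapping(employees), 10));
        System.out.printf("Stream.map:         %.2f ms%n",
                averageMillis(() -> viaMap(employees), 10));
    }
}
```

Timings from a loop like this vary with heap size, garbage collection, and run order, so treat any small gap between the two numbers as noise until a proper benchmark confirms it.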
Note that mapping isn't actually intended to be used directly as a collector, but rather as a downstream collector within another collector:
The mapping() collectors are most useful when used in a multi-level reduction, such as downstream of a groupingBy or partitioningBy.
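To illustrate that downstream use, here is a small sketch that groups employees by age and keeps only their names in each group. The class name MappingDownstreamSketch, the method namesByAge, and the hand-written Employee (standing in for the Lombok-annotated class from the question) are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MappingDownstreamSketch {

    // Plain stand-in for the Lombok-annotated Employee from the question.
    static final class Employee {
        private final String name;
        private final Integer age;
        private final Double salary;

        Employee(String name, Integer age, Double salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }

        String getName() {
            return name;
        }

        Integer getAge() {
            return age;
        }

        Double getSalary() {
            return salary;
        }
    }

    // Groups employees by age; mapping() reduces each group to a list of names.
    static Map<Integer, List<String>> namesByAge(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getAge,
                        Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Employee> employeeList = Arrays.asList(
                new Employee("Tom Jones", 45, 15000.00),
                new Employee("Harry Andrews", 45, 7000.00),
                new Employee("Ethan Hardy", 65, 8000.00),
                new Employee("Nancy Smith", 22, 10000.00),
                new Employee("Deborah Sprightly", 29, 9000.00));

        // Each key is an age; each value holds the names of employees of that age.
        System.out.println(namesByAge(employeeList));
    }
}
```

Here Stream.map cannot stand in for mapping, because the grouping key (the age) has to be read from the Employee before the element is reduced to its name.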
There is something about this in Effective Java, 3rd Edition, too (in Item 46, about 2/3 of the way down page 214, the paragraph starting "The collectors returned by the counting method"). Basically, it says not to use things like mapping in the first way you do here.