Java stream merge or reduce duplicate objects

Question

I need to generate a unique friend list from a list that can contain duplicates, by merging all duplicate entries into one object.
Example: friends are fetched from different social feeds and put into one big list
1. Friend - [name: "Johnny Depp", dob: "1970-11-10", source: "FB", fbAttribute: ".."]
2. Friend - [name: "Christian Bale", dob: "1970-01-01", source: "LI", liAttribute: ".."]
3. Friend - [name: "Johnny Depp", dob: "1970-11-10", source: "Twitter", twitterAttribute: ".."]
4. Friend - [name: "Johnny Depp", dob: "1970-11-10", source: "LinkedIn", liAttribute: ".."]
5. Friend - [name: "Christian Bale", dob: "1970-01-01", source: "LI", liAttribute: ".."]

Expected output
1. Friend - [name: "Christian Bale", dob: "1970-01-01", liAttribute: "..", fbAttribute: "..", twitterAttribute: ".."]
2. Friend - [name: "Johnny Depp", dob: "1970-11-10", liAttribute: "..", fbAttribute: "..", twitterAttribute: ".."]

Question - How can I merge without using any intermediate container? I can easily use an intermediate map and apply reduce on each value of the entry.

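List<Friend> friends;
// groupingBy is a Collector, so it has to be passed to collect()
Map<String, List<Friend>> uniqueFriendMap
    = friends.stream().collect(Collectors.groupingBy(Friend::uniqueFunction));
List<Friend> mergedFriends = uniqueFriendMap.entrySet()
    .stream()
    .map(entry -> {
           return entry.getValue()
                .stream()
                .reduce((a,b) -> friendMergeFunction(a,b));
    })
    .filter(mergedPlace -> mergedPlace.isPresent())
    // unwrap, so that the result is a List<Friend> rather than a List<Optional<Friend>>
    .map(Optional::get)
    .collect(Collectors.toList());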

I would like to do this without using the intermediate Map uniqueFriendMap. Any suggestions?

Recommended answer

The groupingBy operation (or something similar) is unavoidable: the Map it creates is also used during the operation itself, for looking up the grouping keys and finding the duplicates. But you can combine it with the reduction of the group elements:

Map<String, Friend> uniqueFriendMap = friends.stream()
    .collect(Collectors.groupingBy(Friend::uniqueFunction,
        Collectors.collectingAndThen(
            Collectors.reducing((a,b) -> friendMergeFunction(a,b)), Optional::get)));
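
To make the snippet above easier to try out, here is a minimal self-contained sketch. The Friend class, the choice of name plus date of birth as the grouping key, and the attribute-union merge are all assumptions made for illustration; the question does not show its real Friend type or friendMergeFunction.

```java
import java.util.*;
import java.util.stream.Collectors;

public class GroupingReduceDemo {
    // Hypothetical stand-in for the question's Friend type.
    static final class Friend {
        final String name;
        final String dob;
        final Map<String, String> attributes = new TreeMap<>();

        Friend(String name, String dob) {
            this.name = name;
            this.dob = dob;
        }

        // Convenience for building test data: adds one source-specific attribute.
        Friend with(String key, String value) {
            attributes.put(key, value);
            return this;
        }

        // Assumed grouping key: name plus date of birth.
        String uniqueFunction() {
            return name + "|" + dob;
        }
    }

    // Assumed merge: a new Friend carrying the attributes of both inputs.
    static Friend friendMergeFunction(Friend a, Friend b) {
        Friend merged = new Friend(a.name, a.dob);
        merged.attributes.putAll(a.attributes);
        merged.attributes.putAll(b.attributes);
        return merged;
    }

    // Groups by key and reduces each group to a single Friend in one collect().
    static Map<String, Friend> mergeByKey(List<Friend> friends) {
        return friends.stream()
            .collect(Collectors.groupingBy(Friend::uniqueFunction,
                Collectors.collectingAndThen(
                    Collectors.reducing((a, b) -> friendMergeFunction(a, b)),
                    Optional::get)));
    }

    public static void main(String[] args) {
        List<Friend> friends = Arrays.asList(
            new Friend("Johnny Depp", "1970-11-10").with("fbAttribute", ".."),
            new Friend("Christian Bale", "1970-01-01").with("liAttribute", ".."),
            new Friend("Johnny Depp", "1970-11-10").with("twitterAttribute", ".."));

        // Iteration order of the resulting map is unspecified.
        mergeByKey(friends).forEach((key, f) ->
            System.out.println(key + " -> " + f.attributes.keySet()));
    }
}
```

Optional::get is safe here because groupingBy only creates a group once it has seen at least one element for that key, so the reduction never produces an empty Optional.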

The values of the map are already the resulting distinct friends. If you really need a List, you can create it with a plain Collection operation:

List<Friend> mergedFriends = new ArrayList<>(uniqueFriendMap.values());

If this second operation still annoys you, you can hide it within the collect operation:

List<Friend> mergedFriends = friends.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.groupingBy(Friend::uniqueFunction, Collectors.collectingAndThen(
            Collectors.reducing((a,b) -> friendMergeFunction(a,b)), Optional::get)),
        m -> new ArrayList<>(m.values())));

Since the nested collector represents a Reduction (see also this answer), we can use toMap instead:

List<Friend> mergedFriends = friends.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toMap(Friend::uniqueFunction, Function.identity(),
            (a,b) -> friendMergeFunction(a,b)),
        m -> new ArrayList<>(m.values())));
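
A possible variation, not part of the original answer: the expected output in the question lists Christian Bale before Johnny Depp, whereas the default map behind toMap makes no ordering guarantee. The four-argument overload of Collectors.toMap accepts a map supplier, so passing TreeMap::new yields values sorted by key. The sketch below assumes a simplified, hypothetical Friend that uses the name as its key and tracks only a set of sources.

```java
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMapMergeDemo {
    // Hypothetical, simplified Friend: a name and the sources it was seen in.
    static final class Friend {
        final String name;
        final Set<String> sources;

        Friend(String name, String... sources) {
            this.name = name;
            this.sources = new TreeSet<>(Arrays.asList(sources));
        }

        // Assumed grouping key: the name alone.
        String uniqueFunction() {
            return name;
        }
    }

    // Assumed merge: union of the sources, without mutating either argument.
    static Friend friendMergeFunction(Friend a, Friend b) {
        Friend merged = new Friend(a.name);
        merged.sources.addAll(a.sources);
        merged.sources.addAll(b.sources);
        return merged;
    }

    static List<Friend> mergeSorted(List<Friend> friends) {
        return friends.stream()
            .collect(Collectors.collectingAndThen(
                Collectors.toMap(Friend::uniqueFunction, Function.identity(),
                    ToMapMergeDemo::friendMergeFunction,
                    TreeMap::new), // keys, and therefore values, in sorted order
                m -> new ArrayList<>(m.values())));
    }

    public static void main(String[] args) {
        List<Friend> friends = Arrays.asList(
            new Friend("Johnny Depp", "FB"),
            new Friend("Christian Bale", "LI"),
            new Friend("Johnny Depp", "Twitter"));

        for (Friend f : mergeSorted(friends)) {
            System.out.println(f.name + " " + f.sources);
        }
    }
}
```

If encounter order is wanted instead of sorted order, LinkedHashMap::new can be supplied in the same position.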

Depending on whether friendMergeFunction is a static method or instance method, you may replace (a,b) -> friendMergeFunction(a,b) with DeclaringClass::friendMergeFunction or this::friendMergeFunction.
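
As a rough illustration of the two method-reference forms, the sketch below defines both a static and an instance merge method. The class name MethodRefDemo, the methods mergeStatic and mergeInstance, the separator field, and the simplified Friend are all hypothetical names chosen for this example.

```java
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class MethodRefDemo {
    // Hypothetical, simplified Friend: a name and a string listing its sources.
    static final class Friend {
        final String name;
        final String sources;

        Friend(String name, String sources) {
            this.name = name;
            this.sources = sources;
        }

        String uniqueFunction() {
            return name;
        }
    }

    // Static variant, referenced as MethodRefDemo::mergeStatic.
    static Friend mergeStatic(Friend a, Friend b) {
        return new Friend(a.name, a.sources + "," + b.sources);
    }

    private final String separator;

    MethodRefDemo(String separator) {
        this.separator = separator;
    }

    // Instance variant, referenced as this::mergeInstance; it can use instance state.
    Friend mergeInstance(Friend a, Friend b) {
        return new Friend(a.name, a.sources + separator + b.sources);
    }

    static Map<String, Friend> viaStatic(List<Friend> friends) {
        return friends.stream().collect(Collectors.toMap(
            Friend::uniqueFunction, Function.identity(), MethodRefDemo::mergeStatic));
    }

    Map<String, Friend> viaInstance(List<Friend> friends) {
        return friends.stream().collect(Collectors.toMap(
            Friend::uniqueFunction, Function.identity(), this::mergeInstance));
    }

    public static void main(String[] args) {
        List<Friend> friends = Arrays.asList(
            new Friend("Johnny Depp", "FB"),
            new Friend("Johnny Depp", "Twitter"));

        System.out.println(viaStatic(friends).get("Johnny Depp").sources);
        System.out.println(new MethodRefDemo("+").viaInstance(friends).get("Johnny Depp").sources);
    }
}
```

The instance form is mainly useful when the merge logic depends on configuration held by the enclosing object; otherwise the static form keeps the merge function free of hidden state.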

But note that even within your original approach, several simplifications are possible. When you only process the values of a Map, you don’t need to use entrySet(), which requires you to call getValue() on each entry; you can process values() in the first place. Then, you don’t need the verbose input -> { return expression; } syntax, as input -> expression is sufficient. Since the groups produced by the preceding grouping operation cannot be empty, the filter step is unnecessary. So your original approach would look like:

Map<String, List<Friend>> uniqueFriendMap
    = friends.stream().collect(Collectors.groupingBy(Friend::uniqueFunction));
List<Friend> mergedFriends = uniqueFriendMap.values().stream()
    .map(group -> group.stream().reduce((a,b) -> friendMergeFunction(a,b)).get())
    .collect(Collectors.toList());

which is not so bad. As said, the fused operation doesn’t skip the Map creation, as that’s unavoidable. It only skips the creation of the Lists representing each group, since it reduces each group to a single Friend in place.
