Finding duplicates in a List ignoring a field

Question

I've got a List of Persons and I want to find duplicate entries, considering all fields except id. So using the equals() method (and, in consequence, List.contains()) is not possible, because they take id into consideration.

public class Person {
    private String firstname, lastname;
    private int age;
    private long id;
}

Modifying the equals() and hashCode() methods to ignore the id field is not an option, because other parts of the code rely on them.

What's the most efficient way in Java to sort out the duplicates if I want to ignore the id field?

Recommended answer

Build a Comparator<Person> to implement your natural-key ordering and then use a binary-search based deduplication. TreeSet will give you this ability out of the box.

Note that Comparator<T>.compare(a, b) must fulfil the usual antisymmetry, transitivity, consistency and reflexivity requirements, or the binary-search ordering will fail. You should also make it null-aware (e.g. if the firstname field of one, the other or both is null).
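As an illustration of a null-aware natural-key ordering, here is a minimal sketch that assumes Java 8 or later and a Person class with accessors. The class NaturalKeyComparators, its nested Person stand-in, and the getter names are hypothetical and not part of the question. It leans on Comparator.nullsFirst so that null names sort before non-null ones, and it never looks at id.

```java
import java.util.Comparator;

final class NaturalKeyComparators {
    // Hypothetical stand-in for the question's Person, with accessors added for illustration.
    static final class Person {
        private final long id;
        private final String firstname, lastname;
        private final int age;

        Person(long id, String firstname, String lastname, int age) {
            this.id = id;
            this.firstname = firstname;
            this.lastname = lastname;
            this.age = age;
        }

        long getId() { return id; }
        String getFirstname() { return firstname; }
        String getLastname() { return lastname; }
        int getAge() { return age; }
    }

    // String ordering that treats null as smaller than any non-null value.
    private static final Comparator<String> NULL_SAFE =
            Comparator.nullsFirst(Comparator.naturalOrder());

    // Natural-key ordering: firstname, then lastname, then age. The id is deliberately not compared.
    static final Comparator<Person> NATURAL_KEY =
            Comparator.comparing(Person::getFirstname, NULL_SAFE)
                      .thenComparing(Person::getLastname, NULL_SAFE)
                      .thenComparingInt(Person::getAge);
}
```

Two Persons that differ only in id compare as equal under NATURAL_KEY, which is the property a TreeSet needs in order to treat them as duplicates.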

A simple natural-key comparator for your Person class is as follows (it is a static member class, as you haven't shown whether you have accessors for each field).

import java.util.Comparator;

public class Person {
    // Orders Persons by natural key (firstname, lastname, age); id is deliberately ignored.
    public static class NkComparator implements Comparator<Person>
    {
        @Override
        public int compare(Person p1, Person p2)
        {
            if (p1 == null || p2 == null) throw new NullPointerException();
            if (p1 == p2) return 0;
            int i = nullSafeCompareTo(p1.firstname, p2.firstname);
            if (i != 0) return i;
            i = nullSafeCompareTo(p1.lastname, p2.lastname);
            if (i != 0) return i;
            // Integer.compare (Java 7+) avoids the overflow risk of p1.age - p2.age
            return Integer.compare(p1.age, p2.age);
        }
        // Compares two strings, sorting null before any non-null value.
        private static int nullSafeCompareTo(String s1, String s2)
        {
            return (s1 == null)
                    ? (s2 == null) ? 0 : -1
                    : (s2 == null) ? 1 : s1.compareTo(s2);
        }
    }
    private String firstname, lastname;
    private int age;
    private long id;
}

You can then use it to generate a unique list. Use the add method which returns true if and only if the element didn't already exist in the set:

List<Person> newList = new ArrayList<Person>();
TreeSet<Person> nkIndex = new TreeSet<Person>(new Person.NkComparator());
for (Person p : originalList)
    if (nkIndex.add(p)) newList.add(p); // to generate a unique list

Or swap the final line for this one to output the duplicates instead:

    if (!nkIndex.add(p)) newList.add(p); // to collect the duplicates
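For reference, the two variants can be put side by side in one self-contained sketch. The class DuplicateFinder, its nested Person stand-in, and the method names unique and duplicates are hypothetical. The comparator here assumes non-null names purely to keep the example short, and the lambda syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class DuplicateFinder {
    // Hypothetical stand-in for the question's Person class.
    static final class Person {
        final long id;
        final String firstname, lastname;
        final int age;

        Person(long id, String firstname, String lastname, int age) {
            this.id = id;
            this.firstname = firstname;
            this.lastname = lastname;
            this.age = age;
        }
    }

    // Natural-key order ignoring id; assumes non-null names to keep the sketch short.
    static final Comparator<Person> NATURAL_KEY = (p1, p2) -> {
        int i = p1.firstname.compareTo(p2.firstname);
        if (i != 0) return i;
        i = p1.lastname.compareTo(p2.lastname);
        if (i != 0) return i;
        return Integer.compare(p1.age, p2.age);
    };

    // Keeps the first occurrence of every natural key, in original order.
    static List<Person> unique(List<Person> originalList) {
        List<Person> result = new ArrayList<>();
        TreeSet<Person> nkIndex = new TreeSet<>(NATURAL_KEY);
        for (Person p : originalList)
            if (nkIndex.add(p)) result.add(p);
        return result;
    }

    // Collects every entry whose natural key was already seen earlier in the list.
    static List<Person> duplicates(List<Person> originalList) {
        List<Person> result = new ArrayList<>();
        TreeSet<Person> nkIndex = new TreeSet<>(NATURAL_KEY);
        for (Person p : originalList)
            if (!nkIndex.add(p)) result.add(p);
        return result;
    }
}
```

Each add on a TreeSet costs O(log n), so either pass over the list is O(n log n) overall, and neither method modifies the original list.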

Whatever you do, don't use remove on your original list while you are enumerating it; that is why these methods add your unique elements to a new list.
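If the original list really has to be modified in place, one option that avoids calling List.remove inside a for-each loop is to drive the loop with an explicit Iterator and call Iterator.remove. The sketch below is only an illustration: the class InPlaceDedup and the method removeDuplicates are hypothetical names, and it assumes the list's iterator supports remove (as ArrayList's does).

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

final class InPlaceDedup {
    // Removes every element whose natural key has already been seen, keeping first occurrences.
    static <T> void removeDuplicates(List<T> list, Comparator<? super T> naturalKey) {
        TreeSet<T> seen = new TreeSet<>(naturalKey);
        for (Iterator<T> it = list.iterator(); it.hasNext(); ) {
            // Iterator.remove is the supported way to delete during iteration.
            if (!seen.add(it.next())) it.remove();
        }
    }
}
```

On an ArrayList each removal shifts the remaining elements, so building a new list as shown above is usually the simpler and cheaper choice.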

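If you are only interested in a unique list and want to use as few lines as possible (note that the result is then ordered by the comparator rather than by position in the original list):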

TreeSet<Person> set = new TreeSet<Person>(new Person.NkComparator());
set.addAll(originalList);
List<Person> newList = new ArrayList<Person>(set);
