Remove Duplicates from List of HashMap Entries


Question

I have a List<HashMap<String,Object>> which represents a database, where each list record is a database row.



I have 10 columns in my database. There are several rows where the values of 2 particular columns are equal. I need to remove the duplicates from the list after it has been populated with all the rows from the database.



What is an efficient way to do this?



FYI - I am not able to do a DISTINCT while querying the database, because the GroupName is added to the Map at a later stage, after the database is loaded. And since the Id column is not a primary key, once you add GroupName to the Map, you will have duplicates based on the Id + GroupName combination!



Hope my question makes sense. Let me know if more clarification is needed.

Solution


  1. Create a Comparator that compares HashMaps by comparing the key/value pairs you are interested in.

  2. Use Collections.sort(yourlist, yourcomparator);

  3. Now all maps that are similar to each other, based on your comparator, are adjacent in the list.

  4. Create a new list.

  5. Iterate through your first list, keeping track of what you saw last. If the current value differs from the last, add it to the new list.

  6. Your new list should contain no duplicates according to your comparator.

    The cost of iterating through the list is O(n). Sorting is O(n log n). So this algorithm is O(n log n).
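A minimal Java sketch of the sort-then-scan steps above. The column names Id and GroupName are taken from the question; the class and method names are illustrative, and the comparator is an assumption about which two columns define a duplicate:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;

public class DedupeRows {
    // Compare only the two columns that define a duplicate (assumed: Id + GroupName).
    static final Comparator<HashMap<String, Object>> BY_ID_AND_GROUP =
        Comparator.comparing((HashMap<String, Object> m) -> String.valueOf(m.get("Id")))
                  .thenComparing(m -> String.valueOf(m.get("GroupName")));

    static List<HashMap<String, Object>> dedupe(List<HashMap<String, Object>> rows) {
        List<HashMap<String, Object>> sorted = new ArrayList<>(rows);
        sorted.sort(BY_ID_AND_GROUP);                       // O(n log n): duplicates become adjacent
        List<HashMap<String, Object>> result = new ArrayList<>();
        HashMap<String, Object> last = null;
        for (HashMap<String, Object> row : sorted) {        // O(n): keep a row only if it differs from the last
            if (last == null || BY_ID_AND_GROUP.compare(last, row) != 0) {
                result.add(row);
            }
            last = row;
        }
        return result;
    }

    static HashMap<String, Object> row(String id, String group, String other) {
        HashMap<String, Object> m = new HashMap<>();
        m.put("Id", id);
        m.put("GroupName", group);
        m.put("Other", other);
        return m;
    }

    public static void main(String[] args) {
        List<HashMap<String, Object>> rows = new ArrayList<>();
        rows.add(row("1", "A", "x"));
        rows.add(row("1", "A", "y"));  // duplicate by Id + GroupName
        rows.add(row("2", "A", "z"));
        System.out.println(dedupe(rows).size());  // prints 2
    }
}
```

Note that of two rows the comparator considers equal, the one that sorts first survives; if you need to keep a specific one (say, the first loaded), extend the comparator with a tie-breaker.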



    We could also sort on the fly by using a TreeSet with that comparator. Inserts are O(log n), and we have to do this n times, so we again get O(n log n).
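The TreeSet variant can be sketched as follows; as above, the Id and GroupName column names come from the question and everything else is illustrative:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DedupeTreeSet {
    // Same idea: compare only the two columns that define a duplicate (assumed: Id + GroupName).
    static final Comparator<Map<String, Object>> BY_ID_AND_GROUP =
        Comparator.comparing((Map<String, Object> m) -> String.valueOf(m.get("Id")))
                  .thenComparing(m -> String.valueOf(m.get("GroupName")));

    static List<Map<String, Object>> dedupe(List<? extends Map<String, Object>> rows) {
        // Each insert is O(log n); n inserts give O(n log n) overall.
        // The TreeSet silently drops any row the comparator considers equal
        // to one already present, so the first occurrence wins.
        Set<Map<String, Object>> unique = new TreeSet<>(BY_ID_AND_GROUP);
        unique.addAll(rows);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String[] r : new String[][] {{"1", "A"}, {"1", "A"}, {"1", "B"}, {"2", "A"}}) {
            Map<String, Object> m = new HashMap<>();
            m.put("Id", r[0]);
            m.put("GroupName", r[1]);
            rows.add(m);
        }
        System.out.println(dedupe(rows).size());  // prints 3: three distinct Id + GroupName pairs
    }
}
```

This avoids the separate sort-then-scan pass, at the price of losing the original row order (the result comes back in comparator order).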


