C ++ std :: map或std :: set-有效插入重复项 [英] C++ std::map or std::set - efficiently insert duplicates

查看:312
本文介绍了C ++ std :: map或std :: set-有效插入重复项的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一堆充满重复项的数据,我想消除重复项.你知道,例如[1、3、5、5、5、5、7]变为[1、3、5、7].

I have a bunch of data full of duplicates and I want to eliminate the duplicates. You know, e.g. [1, 1, 3, 5, 5, 5, 7] becomes [1, 3, 5, 7].

看来我可以使用std :: map或std :: set来处理此问题.但是,我不确定(a)将所有值简单地插入到容器中,还是(b)检查它们是否已经存在于容器中,仅在不存在时才插入,是否更快-插入是否非常有效?即使有更好的方法...您能建议一种快速的方法吗?

It looks like I can use either std::map or std::set to handle this. However I'm not sure whether it's faster to (a) simply insert all the values into the container, or (b) check whether they already exist in the container and only insert if they don't - are inserts very efficient? Even if there's a better way... can you suggest a fast way to do this?

另一个问题-如果我存储在其中的数据不像整数那么琐碎,而是一个自定义类,那么std :: map如何设法正确存储(散列?)数据以进行快速访问通过运算符[]?

Another question - if the data I'm storing in them isn't as trivial as integers, and instead is a custom class, how does the std::map manage to properly store (hash?) the data for fast access via operator[]?

推荐答案

std::map不使用哈希. std::unordered_map可以,但是是C ++ 11. std::mapstd::set都使用您提供的比较器.类模板具有此比较器的默认值,可以归结为operator<比较,但是您可以提供自己的比较器.

std::map doesn't use hashing. std::unordered_map does, but that's C++11. std::map and std::set both use a comparator that you provide. The class templates have defaults for this comparator, which boils down to an operator< comparison, but you can provide your own.

如果不需要存储键和值(看起来好像不需要),则应该使用std::set,因为这样更合适.

If you don't need both a key and a value to be stored (looks like you don't) you should just use a std::set, as that's more appropriate.

该标准没有说出mapset在后台使用什么数据结构,只是说出证明动作具有一定的时间复杂性.实际上,我知道的大多数实现都使用树.

The Standard doesn't say what data structures maps and sets use under the hood, only that certian actions have certain time complexities. In reality, most implementations I'm aware of use a tree.

如果使用operator[]insert,在时间复杂度方面没有区别,但是在执行search之前,如果要使用insert,我会先使用insertoperator[]找不到.后者意味着要进行两次单独的搜索才能将一个项目插入到集合中.

It makes no difference time-complexity-wise if you use operator[] or insert, but I would use insert or operator[] before I did a search followed by an insert if the item isn't found. The later would imply two seperate searches to insert an item in to the set.

这篇关于C ++ std :: map或std :: set-有效插入重复项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆