Performance of vector sort/unique/erase vs. copy to unordered_set

Question

I have a function that gets all neighbours of a list of points in a grid out to a certain distance, which involves a lot of duplicates (my neighbour's neighbour == me again).

I've been experimenting with a couple of different solutions, but I have no idea which is the more efficient. Below is some code demonstrating two solutions running side by side, one using std::vector sort-unique-erase, the other using std::copy into a std::unordered_set.

I also tried another solution, which is to pass the vector containing the neighbours so far to the neighbour function, which will use std::find to ensure a neighbour doesn't already exist before adding it.
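
In case it helps to see that concretely, here is a minimal sketch of what the third approach might look like; addNeighbour is a hypothetical helper name, not from my actual code:

#include <algorithm>
#include <vector>

// Append a neighbour only if a linear scan doesn't find it already:
// O(N) per insertion, O(N^2) overall. Assumes VecDi has an operator==
// usable in a boolean context (for Eigen-style vectors you'd use
// std::find_if with (v == point).all() instead).
void addNeighbour(std::vector<VecDi>& neighs, const VecDi& point) {
    if (std::find(neighs.begin(), neighs.end(), point) == neighs.end())
        neighs.push_back(point);
}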

So that's three solutions, but I can't quite wrap my head around which is going to be faster. Any ideas?

The code snippet is below:

// Vector of all neighbours of all modified phi points, which may initially include duplicates.
std::vector<VecDi> aneighs;
// Hash function, mapping points to their norm distance.
auto hasher = [&] (const VecDi& a) {
    return std::hash<UINT>()(a.squaredNorm() >> 2);
};
// Unordered set for storing neighbours without duplication.
std::unordered_set<VecDi, decltype(hasher)> sneighs(phi.dims().squaredNorm() >> 2, hasher);

... compute big long list of points including many duplicates ...

// Insert neighbours into unordered_set to remove duplicates.
std::copy(aneighs.begin(), aneighs.end(), std::inserter(sneighs, sneighs.end()));

// De-dupe neighbours list.
// TODO: is this method faster or slower than unordered_set?
std::sort(aneighs.begin(), aneighs.end(), [&] (const VecDi& a, const VecDi&b) {
    const UINT aidx = Grid<VecDi, D>::index(a, phi.dims(), phi.offset());
    const UINT bidx = Grid<VecDi, D>::index(b, phi.dims(), phi.offset());
    return aidx < bidx;
});
aneighs.erase(std::unique(aneighs.begin(), aneighs.end()), aneighs.end());


Answer

A great deal here is likely to depend on the size of the output set (which, in turn, will depend on how far out you sample neighbors).

If it's small (no more than a few dozen items or so), your hand-rolled set implementation using std::vector and std::find will probably remain fairly competitive. Its problem is that it's an O(N²) algorithm: each time you insert an item, you have to search all the existing items, so each insertion is linear in the number of items already in the set. Therefore, as the set grows larger, the time to insert items grows roughly quadratically.

Using std::set, each insertion only has to do approximately log₂(N) comparisons instead of N. That reduces the overall complexity from O(N²) to O(N log N). The major shortcoming is that it's (at least normally) implemented as a tree built up of individually allocated nodes. That typically reduces its locality of reference: each item you insert consists of the data itself plus some pointers, and traversing the tree means following pointers around. Since the nodes are allocated individually, chances are pretty good that nodes that are (currently) adjacent in the tree won't be adjacent in memory, so you'll see a fair number of cache misses. Bottom line: while its speed grows fairly slowly as the number of items increases, the constants involved are fairly large, so for a small number of items it will start out fairly slow (typically quite a bit slower than your hand-rolled version).
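
As a rough sketch of this variant (it reuses the index-based ordering the question's sort call already defines, assuming that's a valid strict weak ordering; none of this code is from the original post):

#include <set>

// Duplicates are rejected at insertion time; each insert costs
// O(log N) comparisons against the tree's individually allocated nodes.
auto cmp = [&] (const VecDi& a, const VecDi& b) {
    return Grid<VecDi, D>::index(a, phi.dims(), phi.offset())
         < Grid<VecDi, D>::index(b, phi.dims(), phi.offset());
};
std::set<VecDi, decltype(cmp)> tneighs(cmp);
for (const VecDi& p : aneighs)
    tneighs.insert(p);  // ignored if an equivalent point is already present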

Using a vector/sort/unique combination gets some of the advantages of each of the preceding. Storing the items in a vector (without extra pointers for each) typically leads to better cache usage: items at adjacent indexes are also at adjacent memory locations, so when you insert a new item, chances are that the location for the new item will already be in the cache. The major disadvantage is that if you're dealing with a really large set, this can use quite a bit more memory: whereas a set eliminates duplicates as you insert each item (i.e., an item is only inserted if it differs from everything already in the set), this inserts all the items and only deletes the duplicates at the end. Given current memory availability and the number of neighbors I'd guess you're probably visiting, I doubt this is a major disadvantage in practice, but under the wrong circumstances it could lead to a serious problem: nearly any use of virtual memory would almost certainly make it a net loss.

Looking at the last from a complexity viewpoint, it's going to be O(N log N), much like the set. The difference is that with the set it's really more like O(N log M), where N is the total number of neighbors and M is the number of unique neighbors. With the vector, it's really O(N log N), where N is (again) the total number of neighbors. As such, if the number of duplicates is extremely large, a set could have a significant algorithmic advantage.
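
To put rough, illustrative numbers on that (hypothetical figures, not measurements): with N = 1,000,000 total neighbors of which only M = 1,000 are unique, the set does on the order of N · log₂(M) ≈ 10⁶ × 10 = 10⁷ comparisons, while sort-then-unique does N · log₂(N) ≈ 10⁶ × 20 = 2 × 10⁷. That's only a factor of two, so the duplication rate has to be quite extreme before the asymptotic edge outweighs the set's larger constant factors.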

It's also possible to implement a set-like structure in a purely linear sequence. This retains the set's advantage of only storing unique items, but also the vector's locality-of-reference advantage. The idea is to keep most of the current set sorted, so you can search it in log(N) complexity. When you insert a new item, however, you just put it in a separate vector (or an unsorted portion of the existing vector), and on each new insertion you also do a linear search over those unsorted items.

When that unsorted part gets too large (for some definition of "too large"), you sort those items and merge them into the main group, then start the same sequence again. If you define "too large" in terms of log N (where N is the number of items in the sorted group), you can retain O(N log N) complexity for the data structure as a whole. When I've played with it, I've found the unsorted portion can be larger than I'd have expected before it starts to cause problems, though.
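
A minimal sketch of that hybrid structure, assuming the element type supports operator< and operator== (the class name FlatSet and the fixed tail limit kMaxTail are illustrative choices; a real version would tie the limit to log N as described above):

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

template <typename T>
class FlatSet {
public:
    void insert(const T& value) {
        // O(log N) membership check against the sorted body...
        if (std::binary_search(sorted_.begin(), sorted_.end(), value))
            return;
        // ...plus a linear scan of the small unsorted tail.
        if (std::find(tail_.begin(), tail_.end(), value) != tail_.end())
            return;
        tail_.push_back(value);
        if (tail_.size() > kMaxTail)
            merge();  // amortised: sort the tail and fold it in
    }

    // Flush the tail and expose the full, sorted, duplicate-free contents.
    const std::vector<T>& items() { merge(); return sorted_; }

private:
    void merge() {
        if (tail_.empty()) return;
        std::sort(tail_.begin(), tail_.end());
        std::vector<T> merged;
        merged.reserve(sorted_.size() + tail_.size());
        std::merge(sorted_.begin(), sorted_.end(),
                   tail_.begin(), tail_.end(),
                   std::back_inserter(merged));
        sorted_ = std::move(merged);
        tail_.clear();
    }

    static constexpr std::size_t kMaxTail = 64;  // simplification; see text
    std::vector<T> sorted_;  // sorted, duplicate-free main group
    std::vector<T> tail_;    // recent, unsorted insertions
};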
