将多个非数字插入到std :: unordered_set< double>中。 [英] Inserting multiple not-a-numbers into a std::unordered_set<double>

查看:143
本文介绍了将多个非数字插入到std :: unordered_set< double>中。的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

IEEE 754标准的后果之一是 std :: unordered_set< double> 的非直观行为,当元素数量不是( NAN s)。

One of consequences of the IEEE 754 standard is the non-intuitive behavior of std::unordered_set<double>, when not-a-number elements (NANs) are inserted.

由于 NAN!= NAN 的事实,遵循以下顺序:

Due to the fact that NAN!=NAN, after the following sequence:

#include <iostream>
#include <cmath>
#include <unordered_set>

int main(){
    std::unordered_set<double> set;
    set.insert(NAN);
    set.insert(NAN);
    std::cout<<"Number of elements "<<set.size()<<"\n";  //there are 2 elements!
}

集中有两个元素实时查看): NAN NAN

我的主要问题是,当 N NAN 被插入到哈希集中,它们都命中相同的哈希值,并且表现为 N 插入哈希集会退化为最坏的运行时间- O(N ^ 2)

Mine main issue with this is, that when N NANs are inserted into the hash-set, they all hit the same hash-bucket and the performance of N inserts into the hash-set degenerates to the worst-case running time - O(N^2).

例如,请参阅问题末尾的清单或在此处-插入 NAN 比普通浮点数要花更多的时间。

For an example, see the listing at the end of the question or here live - inserting NAN takes some order of magnitude more time than a "normal" floating number.

我的问题:是否可以(如果是,如何)以这种方式调整 std :: unordered_set< double> 不论插入的 NAN s( NAN ,- NAN 等)?

My question: is it possible (and if yes - how) to tweak std::unordered_set<double> in such a way, that there is at most one NAN-element in the set, no matter the flavor of inserted NANs (NAN, -NAN and so on)?

列表:

#include <iostream>
#include <cmath>
#include <unordered_set>
#include <chrono>

constexpr int N=5000;
void test_insert(double value)
{
    std::unordered_set<double> s;
    auto begin = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < N; i++) {
        s.insert(value);
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::cout << "Duration: " << (std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin).count() / 1e9) << "\n";
    std::cout << "Number of elements: "<<s.size()<<"\n";
}

int main(){
    std::cout << "Not NAN\n";
    test_insert(1.0);           //takes 0.0001 s
    std::cout << "NAN\n";
    test_insert(NAN);           //takes 0.2 s
}


推荐答案

从您在安德鲁斯答案中的评论,

From your comment in Andrews answer,


我认为此解决方案存在问题:-NAN的散列值不同于NAN,但对于散列-函数h必须成立:如果x == y,那么h(x)== h(y)

I think the problem with this solution: -NAN will have a different hash-value then NAN but for hash-function h must hold: if x==y then also h(x)==h(y)

,因此如果您想 h(-NAN)== h(NAN) ...

This does hash differently, so you need to also define your own hash function if you want h(-NAN) == h(NAN) ...

(来自@Andrew回答)

(augmented from @Andrew answer)

#include <iostream>
#include <cmath>
#include <unordered_set>
#include <chrono>

struct EqualPred
{
    constexpr bool operator()(const double& lhs, const double& rhs) const
    {
        if (std::isnan(lhs) && std::isnan(rhs)) return true;
        return lhs == rhs;
    }
};

template <typename T>
struct Hash
{
    size_t operator()(const T& value) const
    {
        return  std::hash<T>()( std::isnan(value) ? NAN : value);
    }


};

std::unordered_set<double, Hash<double>, EqualPred> s;

constexpr int N=5000;
void test_insert(double value)
{

    auto begin = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < N; i++) {
        s.insert(value);
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::cout << "Duration: " << (std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin).count() / 1e9) << "\n";
    std::cout << "Number of elements: "<<s.size()<<"\n";
}

int main(){
    std::cout << "Not NAN\n";
    test_insert(1.0);           //takes 0.0001 s
    std::cout << "NAN\n";
    test_insert(NAN);  
    test_insert(-NAN);

    std::cout << s.size() << std::endl;
    //takes 0.2 s
}

演示

这篇关于将多个非数字插入到std :: unordered_set&lt; double&gt;中。的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆