如何使用类型擦除键创建 HashMap? [英] How do I create a HashMap with type erased keys?

查看:35
本文介绍了如何使用类型擦除键创建 HashMap?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我希望能够使用各种不同的类型作为 HashMap 中的键,所有这些都将实现 Hash.这似乎应该是可能的:从阅读文档看来,每个 Hasher 都会产生一个 u64 结果,因此它们最终会被简化为一种通用类型.实际上我想做:

I want to be able to use a variety of different types as keys in a HashMap, all of which would implement Hash. This seems like it should be possible: from reading the docs it seems that every Hasher will produce a u64 result, so they eventually get reduced down to a common type. Effectively I want to do:

use std::{collections::HashMap, hash::Hash};

fn x(_: HashMap<Box<dyn Hash>, ()>) {}

我不允许这样做:

error[E0038]: the trait `std::hash::Hash` cannot be made into an object
   --> src/lib.rs:3:9
    |
3   | fn x(_: HashMap<Box<dyn Hash>, ()>) {}
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `std::hash::Hash` cannot be made into an object

看来我可以创建一个Hasher(例如RandomState),用它来手动计算哈希值,然后存储u64 导致 HashMap 但这似乎有点过于复杂.我不想再次取出键值,我只需要能够比较散列值.是否有我不知道的 HashMap 替代方案?还是我以完全错误的方式看待这个问题?

It seems like I could create a Hasher (e.g. RandomState), use that to manually calculate hash values, then store the u64 result in a HashMap<u64, _> but that seems kind of overly complex. I don't ever want to get the key values back out again, I just need to be able to compare the hash values. Is there a HashMap alternative I'm not aware of? Or am I looking at this in a totally wrong way?

推荐答案

似乎我可以创建一个 Hasher(例如 RandomState),用它来手动计算哈希值,然后存储 u64 结果在 HashMap<u64, _> 中,但这似乎有点过于复杂.

It seems like I could create a Hasher (e.g. RandomState), use that to manually calculate hash values, then store the u64 result in a HashMap<u64, _> but that seems kind of overly complex.

不幸的是,这过于简单 - 由于散列函数会丢弃一些信息,因此散列表不仅适用于散列,它们还需要原始键来将其与您插入的键进行比较或寻找.这对于处理两个键产生相同哈希的情况是必要的,这既可能也是允许的.

Unfortunately that's overly simple - since a hash function discards some information, hash tables don't just work on hashes, they also need the original key to compare it with the key you're inserting or looking for. This is necessary to handle the case when two keys produce the same hash, which is both possible and allowed.

话虽如此,您可以在 Rust 中使用异构键实现 Python 样式的哈希表,但您需要将键装箱,即使用 Box<dyn Key> 作为类型擦除键,Key 是您为要用作哈希键的所有具体类型定义的特征.特别是,您需要:

Having said that, you can implement Python-style hash tables with heterogeneous keys in Rust, but you'll need to box the keys, i.e. use Box<dyn Key> as the type-erased key, with Key being a trait you define for all concrete types you want to use as the hash key. In particular, you'll need to:

  • 定义 Key 特征,指定如何比较和散列实际键;
  • (可选)为本身为 HashEq 的类型提供该特征的全面实现,因此用户无需手动为每种类型执行此操作;
  • 使用 trait 提供的方法为 Box 定义 EqHash,制作 Box 可用作 std::collections::HashMap 中的键.
  • define the Key trait that specifies how to compare and hash the actual keys;
  • (optionally) provide a blanket implementation of that trait for types that are themselves Hash and Eq so the user doesn't need to do that for every type manually;
  • define Eq and Hash for Box<dyn Key> using the methods the trait provides, making Box<dyn Key> usable as key in std::collections::HashMap.

Key trait 可以这样定义:

The Key trait could be defined like this:

trait Key {
    fn eq(&self, other: &dyn Key) -> bool;
    fn hash(&self) -> u64;
    // see https://stackoverflow.com/a/33687996/1600898
    fn as_any(&self) -> &dyn Any;
}

这是 Key 的全面实现,适用于 EqHash 的任何类型:

And here is a blanket implementation of Key for any type that is Eq and Hash:

use std::any::{Any, TypeId};
use std::collections::{HashMap, hash_map::DefaultHasher};
use std::hash::{Hash, Hasher};

impl<T: Eq + Hash + 'static> Key for T {
    fn eq(&self, other: &dyn Key) -> bool {
        if let Some(other) = other.as_any().downcast_ref::<T>() {
            return self == other;
        }
        false
    }

    fn hash(&self) -> u64 {
        let mut h = DefaultHasher::new();
        // mix the typeid of T into the hash to make distinct types
        // provide distinct hashes
        Hash::hash(&(TypeId::of::<T>(), self), &mut h);
        h.finish()
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}

最后,这些实现将使 Box 可用作哈希表键:

Finally, these impls will make Box<dyn Key> usable as hash table keys:

impl PartialEq for Box<dyn Key> {
    fn eq(&self, other: &Self) -> bool {
        Key::eq(self.as_ref(), other.as_ref())
    }
}

impl Eq for Box<dyn Key> {}

impl Hash for Box<dyn Key> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        let key_hash = Key::hash(self.as_ref());
        state.write_u64(key_hash);
    }
}

// just a convenience function to box the key
fn into_key(key: impl Eq + Hash + 'static) -> Box<dyn Key> {
    Box::new(key)
}

一切就绪后,您几乎可以像在动态语言中一样使用键:

With all that in place, you can use the keys almost as you would in a dynamic language:

fn main() {
    let mut map = HashMap::new();
    map.insert(into_key(1u32), 10);
    map.insert(into_key("foo"), 20);
    map.insert(into_key("bar".to_string()), 30);
    assert_eq!(map.get(&into_key(1u32)), Some(&10));
    assert_eq!(map.get(&into_key("foo")), Some(&20));
    assert_eq!(map.get(&into_key("bar".to_string())), Some(&30));
}

游乐场

请注意,此实现将始终认为不同具体类型的值具有不同的值.虽然 Python 的字典会将键 11.0 视为相同的键(而 "1" 将是不同的),但 into_key(1u32) 不仅与 into_key(1.0) 不同,而且与 into_key(1u64) 不同.同样,into_key(foo") 将不同于 into_key(foo".to_string()).这可以通过为您关心的类型手动实现 Key 来改变,在这种情况下,必须删除一揽子实现.

Note that this implementation will always consider values of distinct concrete types to have different values. While Python's dictionaries will consider keys 1 and 1.0 to be the same key (while "1" will be distinct), an into_key(1u32) will be distinct not just from into_key(1.0), but also from into_key(1u64). Likewise, into_key("foo") will be a distinct from into_key("foo".to_string()). This could be changed by manually implementing Key for the types you care about, in which case the blanket implementation must be removed.

这篇关于如何使用类型擦除键创建 HashMap?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆