是否存在有用的Haskell HashMap / HashTable / Dictionary库? [英] Does a useful Haskell HashMap/HashTable/Dictionary library exist?

查看:168
本文介绍了是否存在有用的Haskell HashMap / HashTable / Dictionary库?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述



考虑假设的类型:

我想要一个无monad的,不断访问的查询关系数组。

 数据HT kv = ??? 

我想构造一次不可变的结构:

  fromList :: Foldable t,Hashable k => t(k,v) - > HT kv 

我想随后用固定时间访问重复查询它::

  lookup :: Hashable k => HT k v  - > k  - >也许v 

似乎有两个候选库不足:


  • unordered-containers




  • $ h3> unordered-containers

    unordered-containers contains类型为 HashMap 的严格和惰性变体。 HashMap 具有 O(log n)查询,如 lookup 功能。这个查询访问时间似乎是由于构建了 HashMap 类型,这些类型的内部树结构允许 O(log n) < a href =https://hackage.haskell.org/package/unordered-containers-0.2.7.1/docs/Data-HashMap-Lazy.html#v:insert =nofollow> 插入 功能。一个可理解的设计折衷许多用例,但因为我不需要可变的 HashMap ,所以这种折衷妨碍了我的用例。



    散列表



    散列表包含一个 HashTable 类型类和三个具有不同表结构策略的实例类型。这个库的类型定义了一个常量时间 O(1) lookup 函数定义,但它永远嵌入在 ST monad。没有办法冻结有状态的 HashTable 实现,并且没有嵌入状态的查找函数单子。当整个计算被封装在一个状态monad中时,库的类型接口是精心设计的,但这种设计不适合我的使用情况。






    是否存在其他一些库,它定义了可以构造一个不可变常量访问查询 O(1)未嵌入到有状态monad中的关联数组的类型和函数? p>

    是否存在某种方式来包装或修改这些现有的基于散列的库来生成不可变的常量访问查询 O(1)关联数组没有嵌入到有状态monad中?

    解决方案

    您需要的库是... unordered-containers 。或者,如果你愿意的话,可以从 containers 中简单地使用旧的 Data.Map



    unordered-containers 文档解释了为什么您不必担心查找的 O (log n )时间复杂度:


    许多操作的平均复杂度为 O (log n )。这个实现使用了一个大的基数(即16),因此实际上这些操作是不变的。

    这是一种常见的做法的功能数据结构,因为它具有良好的共享属性,同时还具有很好的时间复杂性。即使是非常大的 n ,log 16 仍然会产生非常小的数字,所以您几乎总是可以将这些复杂性视为有效的恒定时间。



    如果这对您的应用程序来说是一个瓶颈,当然,请使用其他方法,但我觉得这种可能性不大。毕竟,log 16 (1,000,000)有点小于5,所以你的查找时间不会很快增长。处理所有这些数据所花费的时间要比查找的开销多得多。



    与往常一样:首先配置文件。如果你不知何故有一个绝对需要世界上最快可能的哈希映射的问题,你可能需要一个命令式哈希映射,但对于我所遇到过的每一种情况,功能函数都能正常工作。


    I'm looking for a monad-free, constant access query O(1) associative array.

    Consider the hypothetical type:

    data HT k v = ???
    

    I want to construct an immutable structure once:

    fromList :: Foldable t, Hashable k => t (k,v) -> HT k v
    

    I want to subsequently query it repeatedly with constant time access::

    lookup :: Hashable k => HT k v -> k -> Maybe v
    

    There appears to be two candidate libraries which fall short:

    unordered-containers

    unordered-containers contains both strict and lazy variants of the type HashMap. Both HashMaps have O(log n) queries as documented by the lookup function. This query access time appears to be due to the construction of the HashMap types, which have an internal tree structure allowing for O(log n) insert functionality. An understandable design trade off for many use-cases, but since I don't need a mutable HashMap this tradeoff hampers my use-case.

    hashtables

    hashtables contains a HashTable type-class and three instance types with varying table constructions strategies. This library's type-class defines a constant time O(1) lookup function definition, but it is eternally embedded in the ST monad. There is no way to "freeze" the stateful HashTable implementations and have a lookup function that is not embedded of a stateful monad. The library's type-class interface is well designed when the entire computation is wrapped in a state monad, but this design is unsuitable for my use-case.


    Does there exist some other library which defines types and functions which can construct an immutable constant access query O(1) associative array that is not embedded in a stateful monad?

    Does there exist some way to wrap or modify these existing hashing-based libraries to produce an immutable constant access query O(1) associative array that is not embedded in a stateful monad?

    解决方案

    The library you want is… unordered-containers. Or just plain old Data.Map from containers, if you’d prefer.

    The note in the unordered-containers documentation explains why you shouldn’t worry about the O(log n) time complexity for lookups:

    Many operations have a average-case complexity of O(log n). The implementation uses a large base (i.e. 16) so in practice these operations are constant time.

    This is a common practice with certain kinds of functional data structures because it allows good sharing properties while also having good time complexities. log16 still produces very small numbers even for very large n, so you can almost always treat those complexities as "effectively constant time".

    If this is ever a bottleneck for your application, sure, go with something else, but I find that highly unlikely. After all, log16(1,000,000) is a little under 5, so your lookup time is not going to grow very quickly. Processing all that data is going to take up much more time than the overhead of the lookup.

    As always: profile first. If you somehow have a problem that absolutely needs the fastest possible hash map in the world, you might need an imperative hash map, but for every case I’ve ever had, the functional ones work just fine.

    这篇关于是否存在有用的Haskell HashMap / HashTable / Dictionary库?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆