是否存在有用的Haskell HashMap / HashTable / Dictionary库? [英] Does a useful Haskell HashMap/HashTable/Dictionary library exist?
问题描述
考虑假设的类型:
我想要一个无monad的,不断访问的查询关系数组。
数据HT kv = ???
我想构造一次不可变的结构:
fromList :: Foldable t,Hashable k => t(k,v) - > HT kv
我想随后用固定时间访问重复查询它::
lookup :: Hashable k => HT k v - > k - >也许v
似乎有两个候选库不足:
-
$h3>
unordered-containers
unordered-containers
contains类型为HashMap
的严格和惰性变体。HashMap
具有 O(log n)查询,如lookup
功能。这个查询访问时间似乎是由于构建了HashMap
类型,这些类型的内部树结构允许 O(log n) < a href =https://hackage.haskell.org/package/unordered-containers-0.2.7.1/docs/Data-HashMap-Lazy.html#v:insert =nofollow>插入
功能。一个可理解的设计折衷许多用例,但因为我不需要可变的HashMap
,所以这种折衷妨碍了我的用例。
散列表
散列表
包含一个HashTable
类型类和三个具有不同表结构策略的实例类型。这个库的类型定义了一个常量时间 O(1)lookup
函数定义,但它永远嵌入在ST
monad。没有办法冻结有状态的
函数单子。当整个计算被封装在一个状态monad中时,库的类型接口是精心设计的,但这种设计不适合我的使用情况。HashTable
实现,并且没有嵌入状态的查找
是否存在其他一些库,它定义了可以构造一个不可变常量访问查询 O(1)未嵌入到有状态monad中的关联数组的类型和函数? p>
是否存在某种方式来包装或修改这些现有的基于散列的库来生成不可变的常量访问查询 O(1)关联数组没有嵌入到有状态monad中?
解决方案您需要的库是...
unordered-containers
。或者,如果你愿意的话,可以从containers
中简单地使用旧的Data.Map
。
unordered-containers
文档解释了为什么您不必担心查找的 O (log n )时间复杂度:
许多操作的平均复杂度为 O (log n )。这个实现使用了一个大的基数(即16),因此实际上这些操作是不变的。
这是一种常见的做法的功能数据结构,因为它具有良好的共享属性,同时还具有很好的时间复杂性。即使是非常大的 n ,log 16 仍然会产生非常小的数字,所以您几乎总是可以将这些复杂性视为有效的恒定时间。
如果这对您的应用程序来说是一个瓶颈,当然,请使用其他方法,但我觉得这种可能性不大。毕竟,log 16 (1,000,000)有点小于5,所以你的查找时间不会很快增长。处理所有这些数据所花费的时间要比查找的开销多得多。
与往常一样:首先配置文件。如果你不知何故有一个绝对需要世界上最快可能的哈希映射的问题,你可能需要一个命令式哈希映射,但对于我所遇到过的每一种情况,功能函数都能正常工作。
I'm looking for a monad-free, constant access query O(1) associative array.
Consider the hypothetical type:
data HT k v = ???
I want to construct an immutable structure once:
fromList :: Foldable t, Hashable k => t (k,v) -> HT k v
I want to subsequently query it repeatedly with constant time access::
lookup :: Hashable k => HT k v -> k -> Maybe v
There appears to be two candidate libraries which fall short:
unordered-containers
unordered-containers
contains both strict and lazy variants of the typeHashMap
. BothHashMap
s have O(log n) queries as documented by thelookup
function. This query access time appears to be due to the construction of theHashMap
types, which have an internal tree structure allowing for O(log n)insert
functionality. An understandable design trade off for many use-cases, but since I don't need a mutableHashMap
this tradeoff hampers my use-case.hashtables
hashtables
contains aHashTable
type-class and three instance types with varying table constructions strategies. This library's type-class defines a constant time O(1)lookup
function definition, but it is eternally embedded in theST
monad. There is no way to "freeze" the statefulHashTable
implementations and have alookup
function that is not embedded of a stateful monad. The library's type-class interface is well designed when the entire computation is wrapped in a state monad, but this design is unsuitable for my use-case.
Does there exist some other library which defines types and functions which can construct an immutable constant access query O(1) associative array that is not embedded in a stateful monad?
Does there exist some way to wrap or modify these existing hashing-based libraries to produce an immutable constant access query O(1) associative array that is not embedded in a stateful monad?
解决方案The library you want is…
unordered-containers
. Or just plain oldData.Map
fromcontainers
, if you’d prefer.The note in the
unordered-containers
documentation explains why you shouldn’t worry about the O(log n) time complexity for lookups:Many operations have a average-case complexity of O(log n). The implementation uses a large base (i.e. 16) so in practice these operations are constant time.
This is a common practice with certain kinds of functional data structures because it allows good sharing properties while also having good time complexities. log16 still produces very small numbers even for very large n, so you can almost always treat those complexities as "effectively constant time".
If this is ever a bottleneck for your application, sure, go with something else, but I find that highly unlikely. After all, log16(1,000,000) is a little under 5, so your lookup time is not going to grow very quickly. Processing all that data is going to take up much more time than the overhead of the lookup.
As always: profile first. If you somehow have a problem that absolutely needs the fastest possible hash map in the world, you might need an imperative hash map, but for every case I’ve ever had, the functional ones work just fine.
这篇关于是否存在有用的Haskell HashMap / HashTable / Dictionary库?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!