Data.STM.LinkedList实现 [英] Data.STM.LinkedList implementation

查看:64
本文介绍了Data.STM.LinkedList实现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找Data.STM.LinkedList实现以获取高性能的链接列表。查看文档,长度函数在O(n)中运行-为什么?在O(1)中实现它是否有任何实际问题?

I'm looking at Data.STM.LinkedList implementation for a high performance linked list. Looking at the documentation, the length function run in O(n) - why is that ? Was there any real issue to implement it in O(1) ?

以下是源代码
https://hackage.haskell.org/package/stm- linkedlist-0.1.0.0 / docs / src / Data-STM-LinkedList-Internal.html#length

是否可以在O(1)中实现?我是Haskell的新手,所以不确定是否包含一些有关列表的元数据是否有问题。

Is it possible implement it in O(1) ? i'm new to Haskell so I'm not sure if holding some metadata about the list is problematic.

谢谢!

推荐答案

对于第一近似而言,Haskell是一种具有足够表现力的语言,以另一种通用语言实现的算法也可以在Haskell中实现,同时保留渐近性能特征。 (这是一个相当低的门槛。大多数通用语言都具有这种表现力。)

To a first approximation, Haskell is a sufficiently expressive language that any algorithm implemented in another general purpose language can also be implemented in Haskell while preserving the asymptotic performance characteristics. (This is a pretty low bar. Most general-purpose languages are this expressive.)

尤其是,尽管Haskell最自然地支持不可变数据结构,但它对可变数据结构及其算法通常可以完全直接转换为Haskell代码。可能会有一些开销(通常是相当大的开销),并且可变数据结构使用起来要比它们的不可变表亲笨拙得多,但还是有可能的。

In particular, though Haskell most naturally supports immutable data structures, it has sufficient support for mutable data that mutable data structures and their algorithms can usually be fairly directly translated into Haskell code. There may be some overhead (often substantial overhead), and mutable data structures may be significantly more awkward to use than their immutable cousins, but it's still possible.

不过,很重要的一点是,要匹配可变数据结构的C ++实现的 actual (相对于渐近)性能,即使不是不可能,也可能极度困难。达到C ++性能的2-3倍之内是合理的,而达到5-10倍的性能则很容易(见下文)。但是,如果您需要匹配C ++的性能,则最好用C ++编写高性能的变异代码,并使用FFI(外部函数接口)连接到该代码。

As a practical matter, though, matching the actual (as opposed to asymptotic) performance of a C++ implementation of a mutable data structure is likely to prove extremely difficult if not impossible. It may be reasonable to get within 2-3 times the performance of C++, and getting within 5-10 times is pretty easy (see below). However, if you need to match C++ performance, you would be probably better off writing the high performance mutating code in C++ and using the FFI (foreign function interface) to interface to that code.

无论如何,使用O(1) length 的中等性能双链表肯定是可能的,并且维护可变的整个列表的元数据没有根本的困难。 。 stm-链表不提供O(1)长度的原因可能与C ++仅保证的原因相同O(n) std :: list<> :: size 性能在C ++ 11之前。即,双向链接列表的许多实际用法都不需要调用 length / size 并提供O (1)性能会带来额外的簿记成本。

Anyway, a "moderate performance" doubly linked-list with O(1) length is certainly possible, and there's no fundamental difficulty with maintaining mutable list-wide metadata. The reason that stm-linkedlist does not provide an O(1) length is probably the same reason that C++ guaranteed only O(n) std::list<>::size performance before C++11. Namely, many practical uses of doubly-linked lists don't ever need to call length/size, and providing O(1) performance comes with an additional bookkeeping cost.

作为概念证明,以下数据类型足以实现带有O的完全可变的双向链表(1)长度功能。在此,以下划线结尾的类型和标识符仅供内部使用。列表的指针严格(因此没有无限的列表!),但其值却是惰性的。

As a proof of concept, the following data types are sufficient to implement a fully mutable doubly-linked list with an O(1) length function. Here, types and identifiers ending in underscores are for internal use only. The list is strict in its pointers (so no infinite lists!) but lazy in its values.

data List a = List
  { headNode_ :: !(IORef (Node_ a))
  , length_ :: !(IORef Int) }
data Node_ a = Node_
  { prev_ :: !(IORef (Node_ a))
  , next_ :: !(IORef (Node_ a))
  , value_ :: a }

列表类型包含一个指向不完整的指针(即 IORef ) code> headNode 指向列表的开头和结尾(或指向空列表的本身),但具有未定义的值字段。这使其成为不安全的节点值,因此最终用户绝对不能直接访问它。 列表还包含一个指向列表长度值的指针。

The List type contains a pointer (i.e., IORef) to an incomplete headNode that points to the start and end of the list (or to itself for an empty list) but has an undefined value field. That makes this an unsafe node value, so it should never be directly accessible to the end-user. The List also contains a pointer to the list length value.

附加类型 Node (无下划线)用于用其相应的列表装饰节点指针(如注释中的迭代器),以使列表元数据可用于需要它的函数:

An additional type Node (no underscore) is used to decorate a node pointer with its corresponding list (like the "iterator" from the comments), to make the list metadata available to functions that need it:

data Node a = Node
  { node_ :: !(IORef (Node_ a))
  , list_ :: !(List a) }

请注意 List 节点是用于处理列表的面向用户的数据类型。

Note that List and Node are the user-facing data types for working with lists.

您创建一个像这样的列表:

You create an empty list like so:

empty :: IO (List a)
empty = mdo
  n <- newIORef (Node_ n n undefined)
  List n <$> newIORef 0

在给定节点之前和之后的插入操作如下。在这里,不安全的头节点表示会得到回报,因为该算法可以将列表开头和结尾处的插入视为头节点与实际列表节点之间插入的特殊情况。

Insertion before and after a given node works as follows. Here's where the unsafe head node representation pays off, since the algorithm can treat insertion at the beginning and end of the list as special cases of insertion between the head node and an actual list node.

insertBefore :: a -> Node a -> IO (Node a)
insertBefore x Node{node_=rnode2, list_} = do
  Node_{prev_=rnode1} <- readIORef rnode2
  insertBetween_ x list_ rnode1 rnode2

insertAfter :: a -> Node a -> IO (Node a)
insertAfter x Node{node_=rnode1, list_} = do
  Node_{next_=rnode2} <- readIORef rnode1
  insertBetween_ x list_ rnode1 rnode2

insertBetween_ :: a -> List a -> IORef (Node_ a) -> IORef (Node_ a) -> IO (Node a)
insertBetween_ x l rnode1 rnode2 = do
  modifyIORef' (length_ l) succ
  newnode <- newIORef (Node_ rnode1 rnode2 x)
  modifyIORef' rnode1 (\n -> n{next_=newnode})
  modifyIORef' rnode2 (\n -> n{prev_=newnode})
  return $ Node newnode l

由于不允许用户拥有头节点,因此我们需要在开头和结尾处插入其他面向用户的功能列表中的一个:

Since a user isn't allowed to "have" a head node, we need additional user-facing functions to insert at the beginning and end of a list:

prepend :: a -> List a -> IO (Node a)
prepend x l = insertAfter x (Node (headNode_ l) l)

append :: a -> List a -> IO (Node a)
append x l = insertBefore x (Node (headNode_ l) l)

请注意,所有插入都通过 insertBetween _ 进行,这会增加长度值。

Observe that all insertions go through insertBetween_ which is responsible for increasing the length value.

删除操作简单而统一无论是内部节点还是开始或结束节点。所有删除都通过此 delete 函数执行,该函数负责减小长度值。

Deletion is straightforward and uniform whether it's an internal node or one at the start or end. All deletions go through this delete function which is responsible for decreasing the length value.

delete :: Node a -> IO ()
delete Node{node_,list_} = do
  modifyIORef' (length_ list_) pred
  Node_{next_, prev_} <- readIORef node_
  modifyIORef' prev_ (\n -> n{next_=next_})
  modifyIORef' next_ (\n -> n{prev_=prev_})

删除头节点将是一场灾难,但是不允许用户拥有这样的 Node ,因此我们

Deletion of the head node would be a disaster, but users aren't allowed to have such a Node, so we're safe.

如果用户具有 Node ,她可以在列表中来回移动:

If a user has a Node, she can move back and forth through the list:

prev :: Node a -> IO (Maybe (Node a))
prev Node{node_, list_} = do
  Node_{prev_} <- readIORef node_
  return $ maybeNode_ prev_ list_

next :: Node a -> IO (Maybe (Node a))
next Node{node_, list_} = do
  Node_{next_} <- readIORef node_
  return $ maybeNode_ next_ list_

maybeNode_ :: IORef (Node_ a) -> List a -> Maybe (Node a)
maybeNode_ n l =
  if n == headNode_ l
  then Nothing
  else Just (Node n l)

请注意,我们必须注意永远不要给用户头节点,因此 maybeNode _ 会检查并返回 Nothing

Note that we must take care never to give the user the head node, so maybeNode_ here checks for it and returns Nothing instead.

要开始使用,用户可以获取<$的开头或结尾c $ c>列表,使用以下功能(在列表中使用上一个 next 禁止的头节点):

To get started, the user can get the start or end of a List using the following functions (which use prev or next on the forbidden head node):

start :: List a -> IO (Maybe (Node a))
start l = next $ Node (headNode_ l) l

end :: List a -> IO (Maybe (Node a))
end l = prev $ Node (headNode_ l) l

缺少的只是一些其他查询功能:

All that's missing are a few miscellaneous query functions:

value :: Node a -> IO a
value = fmap value_ . readIORef . node_

null :: List a -> IO Bool
null l = (==0) <$> length l

length :: List a -> IO Int
length = readIORef . length_

一些实用程序可以转换为纯列表:

some utilities to convert to plain lists:

toList :: List a -> IO [a]
toList = toList_ next_

toListRev :: List a -> IO [a]
toListRev = toList_ prev_

toList_ :: (Node_ a -> IORef (Node_ a)) -> List a -> IO [a]
toList_ dir l = go =<< readIORef h
  where h = headNode_ l
        go n = do
          if dir n == h then return []
            else do
            n' <- readIORef (dir n)
            (value_ n':) <$> go n'

和一个 Show 实例调试:

instance (Show a) => Show (List a) where
  showsPrec d lst = showParen (d > 10) $ showString "fromList " . showsPrec 11 (unsafePerformIO $ toList lst)

警告:此<$ c如果在完全评估生成的字符串之前对列表进行了更改,则$ c> Show 实例是不安全的,因此只能用于调试(可能已从生产版本中删除)。

WARNING: This Show instance is unsafe if the list is mutated before the generated string is fully evaluated, so it should only be used for debugging (and probably removed from a production version).

而且,由于我们可以删除和重新插入,并不是严格必需的,但是如果没有就地修改元素,就不会有自尊的可变结构:

Also, while it's not strictly necessary since we can delete and re-insert, no self-respecting mutable structure would be complete without in-place modification of elements:

modify :: (a -> a) -> Node a -> IO ()
modify f Node{node_} = modifyIORef' node_ (\n -> n { value_ = f (value_ n) })

这是完整的代码。 (有关用法的示例,请参见定义 ex1 。)欢迎使用它作为自己实施的起点。它未经测试且没有基准测试,除了一些快速测试表明它可能比C ++实现慢大约5-10倍。

Here's the full code. (See the definition ex1 for example usage.) You're welcome to use it as a starting point for your own implementation. It's untested and unbenchmarked, except that a couple of quick tests show that it's probably about 5-10x slower than a C++ implementation.

{-# LANGUAGE NamedFieldPuns, RecursiveDo #-}

module LinkedList
  ( List, Node
  , value, null, length
  , empty, prepend, append, insertBefore, insertAfter, delete, modify
  , prev, next, start, end
  , toList, toListRev
  ) where

import System.IO.Unsafe
import Control.Monad
import Prelude hiding (null, length)

import Data.IORef

data List a = List
  { headNode_ :: !(IORef (Node_ a))
  , length_ :: !(IORef Int) }
data Node a = Node
  { node_ :: !(IORef (Node_ a))
  , list_ :: !(List a) }
data Node_ a = Node_
  { prev_ :: !(IORef (Node_ a))
  , next_ :: !(IORef (Node_ a))
  , value_ :: a }

-- unsafe show instance: remove from production version
instance (Show a) => Show (List a) where
  showsPrec d lst = showParen (d > 10) $ showString "fromList " . showsPrec 11 (unsafePerformIO $ toList lst)

value :: Node a -> IO a
value = fmap value_ . readIORef . node_

null :: List a -> IO Bool
null l = (==0) <$> length l

length :: List a -> IO Int
length = readIORef . length_

empty :: IO (List a)
empty = mdo
  n <- newIORef (Node_ n n undefined)
  List n <$> newIORef 0

prepend :: a -> List a -> IO (Node a)
prepend x l = insertAfter x (Node (headNode_ l) l)

append :: a -> List a -> IO (Node a)
append x l = insertBefore x (Node (headNode_ l) l)

insertBefore :: a -> Node a -> IO (Node a)
insertBefore x Node{node_=rnode2, list_} = do
  Node_{prev_=rnode1} <- readIORef rnode2
  insertBetween_ x list_ rnode1 rnode2

insertAfter :: a -> Node a -> IO (Node a)
insertAfter x Node{node_=rnode1, list_} = do
  Node_{next_=rnode2} <- readIORef rnode1
  insertBetween_ x list_ rnode1 rnode2

insertBetween_ :: a -> List a -> IORef (Node_ a) -> IORef (Node_ a) -> IO (Node a)
insertBetween_ x l rnode1 rnode2 = do
  modifyIORef' (length_ l) succ
  newnode <- newIORef (Node_ rnode1 rnode2 x)
  modifyIORef' rnode1 (\n -> n{next_=newnode})
  modifyIORef' rnode2 (\n -> n{prev_=newnode})
  return $ Node newnode l

delete :: Node a -> IO ()
delete Node{node_,list_} = do
  modifyIORef' (length_ list_) pred
  Node_{next_, prev_} <- readIORef node_
  modifyIORef' prev_ (\n -> n{next_=next_})
  modifyIORef' next_ (\n -> n{prev_=prev_})

modify :: (a -> a) -> Node a -> IO ()
modify f Node{node_} = modifyIORef' node_ (\n -> n { value_ = f (value_ n) })

prev :: Node a -> IO (Maybe (Node a))
prev Node{node_, list_} = do
  Node_{prev_} <- readIORef node_
  return $ maybeNode_ prev_ list_

next :: Node a -> IO (Maybe (Node a))
next Node{node_, list_} = do
  Node_{next_} <- readIORef node_
  return $ maybeNode_ next_ list_

maybeNode_ :: IORef (Node_ a) -> List a -> Maybe (Node a)
maybeNode_ n l =
  if n == headNode_ l
  then Nothing
  else Just (Node n l)

start :: List a -> IO (Maybe (Node a))
start l = next $ Node (headNode_ l) l

end :: List a -> IO (Maybe (Node a))
end l = prev $ Node (headNode_ l) l

toList :: List a -> IO [a]
toList = toList_ next_

toListRev :: List a -> IO [a]
toListRev = toList_ prev_

toList_ :: (Node_ a -> IORef (Node_ a)) -> List a -> IO [a]
toList_ dir l = go =<< readIORef h
  where h = headNode_ l
        go n = do
          if dir n == h then return []
            else do
            n' <- readIORef (dir n)
            (value_ n':) <$> go n'

ex1 :: IO (List Int)
ex1 = do
  t <- empty
  mapM_ (flip prepend t) [10,9..1]
  mapM_ (flip append t) [11..20]
  return t

这篇关于Data.STM.LinkedList实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆