初始和尾部的空间复杂性是什么? [英] What are the space complexities of inits and tails?

查看:84
本文介绍了初始和尾部的空间复杂性是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在阅读了Okasaki的纯功能数据结构中有关持久性的文章并仔细阅读了他关于单链接列表(这是Haskell列表的实现方式)的示例之后,让我们想知道Data.Listinitstails ...

After reading the passage about persistence in Okasaki's Purely Functional Data Structures and going over his illustrative examples about singly linked lists (which is how Haskell's lists are implemented), I was left wondering about the space complexities of Data.List's inits and tails...

在我看来

  • tails的空间复杂度在其参数长度上为 linear ,并且
  • inits的空间复杂度在其参数长度上为二次方
  • the space complexity of tails is linear in the length of its argument, and
  • the space complexity of inits is quadratic in the length of its argument,

但简单的基准测试则相反.

but a simple benchmark indicates otherwise.

使用tails,可以共享原始列表.计算tails xs仅在于遍历列表xs并创建指向该列表中每个元素的新指针.无需在内存中重新创建xs的一部分.

With tails, the original list can be shared. Computing tails xs simply consists in walking along list xs and creating a new pointer to each element of that list; no need to recreate part of xs in memory.

相反,由于inits xs的每个元素以不同的方式结尾",因此不能进行此类共享,并且必须从头开始在内存中重新创建xs的所有可能的前缀.

In contrast, because each element of inits xs "ends in a different way", there can be no such sharing, and all the possible prefixes of xs must be recreated from scratch in memory.

下面的简单基准测试表明,这两个函数之间的内存分配没有太大区别:

The simple benchmark below shows there isn't much of a difference in memory allocation between the two functions:

-- Main.hs

import Data.List (inits, tails)

main = do
    let intRange = [1 .. 10 ^ 4] :: [Int]
    print $ sum intRange
    print $ fInits intRange
    print $ fTails intRange

fInits :: [Int] -> Int
fInits = sum . map sum . inits

fTails :: [Int] -> Int
fTails = sum . map sum . tails

编译我的Main.hs文件后

ghc -prof -fprof-auto -O2 -rtsopts Main.hs

并运行

./Main +RTS -p

Main.prof文件报告以下内容:

COST CENTRE MODULE  %time %alloc

fInits      Main     60.1   64.9
fTails      Main     39.9   35.0

分配给fInits的内存和分配给fTails的内存具有相同的数量级...哼...

The memory allocated for fInits and that allocated for fTails have the same order of magnitude... Hum...

  • 关于tails(线性)和inits(二次)的空间复杂度的结论是否正确?
  • 如果是这样,为什么GHC会为fInitsfTails分配大致相同的内存? 列表融合是否与此有关?
  • 还是我的基准测试存在缺陷?
  • Are my conclusions about the space complexities of tails (linear) and inits (quadratic) correct?
  • If so, why does GHC allocate roughly as much memory for fInits and fTails? Does list fusion have something to do with this?
  • Or is my benchmark flawed?

推荐答案

Haskell报告中inits的实现,与基于4.7.0.1(GHC 7.8.3)之前使用的实现相同或几乎相同.太慢了.特别是fmap应用程序是递归堆叠的,因此强制结果的连续元素变得越来越慢.

The implementation of inits in the Haskell Report, which is identical to or nearly identical to implementations used up to base 4.7.0.1 (GHC 7.8.3) is horribly slow. In particular, the fmap applications stack up recursively, so forcing successive elements of the result gets slower and slower.

inits [1,2,3,4] = [] : fmap (1:) (inits [2,3,4])
 = [] : fmap (1:) ([] : fmap (2:) (inits [3,4]))
 = [] : [1] : fmap (1:) (fmap (2:) ([] : fmap (3:) (inits [4])))
....

Bertram Felgenhauer探索的最简单的渐近最优实现是基于应用take并带有依次更大的参数:

The simplest asymptotically optimal implementation, explored by Bertram Felgenhauer, is based on applying take with successively larger arguments:

inits xs = [] : go (1 :: Int) xs where
  go !l (_:ls) = take l xs : go (l+1) ls
  go _  []     = []

Felgenhauer使用私有的非融合版本的take可以从中获得一些额外的性能,但是仍然不如预期的快.

Felgenhauer was able to eke some extra performance out of this using a private, non-fusing version of take, but it was still not as fast as it could be.

在大多数情况下,以下非常简单的实现明显更快:

The following very simple implementation is significantly faster in most cases:

inits = map reverse . scanl (flip (:)) []

在某些奇怪的情况下(例如map head . inits),这种简单的实现在渐近上不是最佳的.因此,我基于Chris Okasaki的Banker的队列,使用相同的技术编写了一个版本,该版本既渐近最优,又几乎一样快. Joachim Breitner主要通过使用严格的scanl'而不是通常的scanl对其进行了进一步优化,GHC 7.8.4中引入了此实现. inits现在可以在O(n)时间内产生结果的书脊.强制整个结果需要O(n ^ 2)时间,因为在所有初始段之间都不能共享任何问题.如果您真的想以极快的速度initstails,最好的选择是使用Data.Sequence. Louis Wasserman的实现神奇.另一种可能是使用Data.Vector—大概是对此类事物使用切片.

In some weird corner cases (like map head . inits), this simple implementation is asymptotically non-optimal. I therefore wrote a version using the same technique, but based on Chris Okasaki's Banker's queues, that is both asymptotically optimal and nearly as fast. Joachim Breitner optimized it further, primarily by using a strict scanl' rather than the usual scanl, and this implementation got into GHC 7.8.4. inits can now produce the spine of the result in O(n) time; forcing the entire result requires O(n^2) time because none of the conses can be shared among the different initial segments. If you want really absurdly fast inits and tails, your best bet is to use Data.Sequence; Louis Wasserman's implementation is magical. Another possibility would be to use Data.Vector—it presumably uses slicing for such things.

这篇关于初始和尾部的空间复杂性是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆