'inits'的有效版本 [英] Efficient version of 'inits'

查看:93
本文介绍了'inits'的有效版本的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

也就是 initsabc== [,a,ab,abc]



inits docs / Data-List.html#g:10rel =noreferrer> Data.List ,但下面我自己写了一个版本:

  myInits = f id 
其中
f start(l:ls)=(start []):( f (start。(l :))ls)
f start [] = [(start [])]

虽然我的版本比标准版本稍微简单一些,但我怀疑它不如效率高。我怀疑当 myInits l 被完全评估时,它需要 O(n ^ 2)空间。例如, myTails tails 的实现:

  myTails a @(_:ls)= a:(myTails ls)
myTails [] = [[]]

这与标准版几乎完全一样,我怀疑可以达到 O(n) space重复使用列表的尾巴。



有人可以解释:


  1. 为什么我的 inits 的版本不好。

  2. 为什么另一种方法更好(标准版本是 Data.List 或您自己的)。


解决方案 c> myInits 使用称为差异列表的技术来实现功能以线性时间构建列表。我相信,但没有检查过,完全评估 myInits 的运行时间是 O(n ^ 2)需要 O(n ^ 2)空格。充分评估 inits 还需要 O(n ^ 2)运行时间和空间。任何版本的 inits 都需要 O(n ^ 2)空格;使用 [] 构建的列表只能共享它们的尾部,并且结果中没有共同的尾部 inits Data.List 中的 inits 版本使用分摊时间 O(1) code>队列,就像在相关答案的后半部分描述的更简单的队列 Snoc在 Data.List 中的源代码中引用的在>缺点(另一个名称)向后 - 能够将一个项目附加到列表的末尾。



简单地尝试使用这些函数时,如果在大型列表中使用稀疏, myInits 会表现令人满意。在我的电脑上,在ghci中, myInits [1 ..] !! 8000000 在几秒内产生结果。不幸的是,我有可怕的低效率实施与ghc 7.8.3一起发布,所以我无法比较 myInits inits 。 p>

严格性



myInits inits myTails tails 之间。当它们应用于 undefined _ | _ 时(发音为bottom,我们使用的另一个符号 undefined )。



inits 的strictness属性 inits(xs ++ _ | _)= inits xs ++ _ | _ ,其中 xs 是空列表 [] 表示 inits 应用于 undefined code>

  inits(xs ++ _ | _)= inits xs ++ _ | _ 
inits([] ++ _ | _)= inits [] ++ _ | _
inits _ | _ = [[]] ++ _ | _
inits _ | _ = []:_ _ | _

我们可以看到这个实验性的

 >头。 inits $ undefined 
[]

myInits

 >没有这个属性可用于空列表或更长的列表。 head $ myInits undefined 
***例外:Prelude.undefined

>取3 $ myInits([1,2] ++ undefined)
[[],[1] ***例外:Prelude.undefined

如果我们知道 myInits 中的 f 在任一分支中产生 start [] 。因此,我们可以延迟模式匹配,直到需要决定接下来做什么为止。

  myInits'= f id 
其中
f start list =(start []):
案例列表
(l:ls) - > f(start。(l :))ls
[] - > []

这种懒惰增加使得 myInits'工作就像 inits

 > head $ myInits'undefined 
[]

>取3 $ myInits'([1,2] ++ undefined)
[[],[1],[1,2]]

同样,您的 myTails Data.List 尾部中的 rel =nofollow noreferrer> / code> tails 产生整个列表作为第一个结果,然后再决定是否有剩余的列表。文档说它服从尾巴_ | _ = _ | _:_ | _ ,但它实际上服从一个更加强大的规则,很难容易描述。


That is, inits "abc" == ["","a","ab","abc"]

There is a standard version of inits in Data.List, but below I have written a version myself:

myInits = f id
 where
  f start (l:ls) = (start []):(f (start . (l:)) ls)
  f start [] = [(start [])]

Whilst my version is quite a bit simpler than the standard version, I suspect it's not as good for efficiency reasons. I suspect that when myInits l is fully evaluated it takes O(n^2) space. Take for example, myTails, an implementation of tails:

myTails a@(_:ls) = a:(myTails ls)
myTails [] = [[]]

Which is almost exactly the same as the standard version and I suspect achieves O(n) space by reusing the tails of the lists.

Could someone explain:

  1. Why my version of inits is bad.
  2. Why another approach is better (either the standard one in Data.List or your own).

解决方案

Your myInits uses a technique called a difference list to make functions that build lists in linear time. I believe, but haven't checked, that the running time for completely evaluating myInits is O(n^2) requiring O(n^2) space. Fully evaluating inits also requires O(n^2) running time and space. Any version of inits will require O(n^2) space; lists built with : and [] can only share their tails, and there are no tails in common among the results of inits. The version of inits in Data.List uses an amortized time O(1) queue, much like the simpler queue described in the second half of a related answer. The Snoc referenced in the source code in Data.List is word play on Cons (another name for :) backwards - being able to append an item to the end of the list.

Briefly experimenting with these functions suggests myInits performs satisfactorily when used sparsely on a large list. On my computer, in ghci, myInits [1..] !! 8000000 yields results in a few seconds. Unfortunately, I have the horrifyingly inefficient implementation that shipped with ghc 7.8.3, so I can't compare myInits to inits.

Strictness

There is one big difference between myInits and inits and between myTails and tails. They have different meanings when applied to undefined or _|_ (pronounced "bottom", another symbol we use for undefined).

inits has the strictness property inits (xs ++ _|_) = inits xs ++ _|_, which, when xs is the empty list [] says that inits will still yield at least one result when applied to undefined

inits (xs ++ _|_) = inits xs ++ _|_
inits ([] ++ _|_) = inits [] ++ _|_
inits        _|_  = [[]]     ++ _|_
inits        _|_  =  []      :  _|_

We can see this experimentally

> head . inits $ undefined
[]

myInits does not have this property either for the empty list or for longer lists.

> head $ myInits undefined
*** Exception: Prelude.undefined

> take 3 $ myInits ([1,2] ++ undefined)
[[],[1]*** Exception: Prelude.undefined

We can fix this if we realize that f in myInits would yield start [] in either branch. Therefore, we can delay the pattern matching until it is needed to decide what to do next.

myInits' = f id
 where
  f start list = (start []):
                 case list of
                     (l:ls) -> f (start . (l:)) ls
                     []     -> []

This increase in laziness makes myInits' work just like inits.

> head $ myInits' undefined
[]

> take 3 $ myInits' ([1,2] ++ undefined)
[[],[1],[1,2]]

Similarly, the difference between your myTails and tails in Data.List is that tails yields the entire list as the first result before deciding if there will be a remainder of the list. The documentation says it obeys tails _|_ = _|_ : _|_, but it actually obeys a much stronger rule that's hard to describe easily.

这篇关于'inits'的有效版本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆