'inits'的有效版本 [英] Efficient version of 'inits'
问题描述
也就是 initsabc== [,a,ab,abc]
在 inits docs / Data-List.html#g:10rel =noreferrer> Data.List
,但下面我自己写了一个版本:
myInits = f id
其中
f start(l:ls)=(start []):( f (start。(l :))ls)
f start [] = [(start [])]
虽然我的版本比标准版本稍微简单一些,但我怀疑它不如效率高。我怀疑当 myInits l
被完全评估时,它需要 O(n ^ 2)
空间。例如, myTails
, tails
的实现:
myTails a @(_:ls)= a:(myTails ls)
myTails [] = [[]]
这与标准版几乎完全一样,我怀疑可以达到 O(n)
space重复使用列表的尾巴。
有人可以解释:
- 为什么我的
inits
的版本不好。 - 为什么另一种方法更好(标准版本是
Data.List
或您自己的)。
myInits
的运行时间是 O(n ^ 2)
需要 O(n ^ 2)
空格。充分评估 inits
还需要 O(n ^ 2)
运行时间和空间。任何版本的 inits
都需要 O(n ^ 2)
空格;使用:
和 []
构建的列表只能共享它们的尾部,并且结果中没有共同的尾部 inits
。 Data.List
中的 inits
版本使用分摊时间 O(1) code>队列,就像在相关答案的后半部分描述的更简单的队列。 Snoc在 Data.List
中的源代码中引用的在>缺点
(另一个名称:
)向后 - 能够将一个项目附加到列表的末尾。
简单地尝试使用这些函数时,如果在大型列表中使用稀疏, myInits
会表现令人满意。在我的电脑上,在ghci中, myInits [1 ..] !! 8000000
在几秒内产生结果。不幸的是,我有可怕的低效率实施与ghc 7.8.3一起发布,所以我无法比较 myInits
和 inits
。 p> 严格性
myInits
和 inits
和 myTails
和 tails
之间。当它们应用于 undefined
或 _ | _
时(发音为bottom,我们使用的另一个符号 undefined
)。
inits
的strictness属性 inits(xs ++ _ | _)= inits xs ++ _ | _
,其中 xs
是空列表 []
表示 undefined
inits(xs ++ _ | _)= inits xs ++ _ | _
inits([] ++ _ | _)= inits [] ++ _ | _
inits _ | _ = [[]] ++ _ | _
inits _ | _ = []:_ _ | _
我们可以看到这个实验性的
>头。 inits $ undefined
[]
myInits $ c $
>没有这个属性可用于空列表或更长的列表。 head $ myInits undefined
***例外:Prelude.undefined
>取3 $ myInits([1,2] ++ undefined)
[[],[1] ***例外:Prelude.undefined
如果我们知道 myInits
中的 f
在任一分支中产生 start []
。因此,我们可以延迟模式匹配,直到需要决定接下来做什么为止。
myInits'= f id
其中
f start list =(start []):
案例列表
(l:ls) - > f(start。(l :))ls
[] - > []
这种懒惰增加使得 myInits'
工作就像 inits
。
> head $ myInits'undefined
[]
>取3 $ myInits'([1,2] ++ undefined)
[[],[1],[1,2]]
同样,您的 myTails
和 Data.List 尾部中的 / code>
tails
产生整个列表作为第一个结果,然后再决定是否有剩余的列表。文档说它服从尾巴_ | _ = _ | _:_ | _
,但它实际上服从一个更加强大的规则,很难容易描述。
That is, inits "abc" == ["","a","ab","abc"]
There is a standard version of inits
in Data.List
, but below I have written a version myself:
myInits = f id
where
f start (l:ls) = (start []):(f (start . (l:)) ls)
f start [] = [(start [])]
Whilst my version is quite a bit simpler than the standard version, I suspect it's not as good for efficiency reasons. I suspect that when myInits l
is fully evaluated it takes O(n^2)
space. Take for example, myTails
, an implementation of tails
:
myTails a@(_:ls) = a:(myTails ls)
myTails [] = [[]]
Which is almost exactly the same as the standard version and I suspect achieves O(n)
space by reusing the tails of the lists.
Could someone explain:
- Why my version of
inits
is bad. - Why another approach is better (either the standard one in
Data.List
or your own).
Your myInits
uses a technique called a difference list to make functions that build lists in linear time. I believe, but haven't checked, that the running time for completely evaluating myInits
is O(n^2)
requiring O(n^2)
space. Fully evaluating inits
also requires O(n^2)
running time and space. Any version of inits
will require O(n^2)
space; lists built with :
and []
can only share their tails, and there are no tails in common among the results of inits
. The version of inits
in Data.List
uses an amortized time O(1)
queue, much like the simpler queue described in the second half of a related answer. The Snoc
referenced in the source code in Data.List
is word play on Cons
(another name for :
) backwards - being able to append an item to the end of the list.
Briefly experimenting with these functions suggests myInits
performs satisfactorily when used sparsely on a large list. On my computer, in ghci, myInits [1..] !! 8000000
yields results in a few seconds. Unfortunately, I have the horrifyingly inefficient implementation that shipped with ghc 7.8.3, so I can't compare myInits
to inits
.
Strictness
There is one big difference between myInits
and inits
and between myTails
and tails
. They have different meanings when applied to undefined
or _|_
(pronounced "bottom", another symbol we use for undefined
).
inits
has the strictness property inits (xs ++ _|_) = inits xs ++ _|_
, which, when xs
is the empty list []
says that inits
will still yield at least one result when applied to undefined
inits (xs ++ _|_) = inits xs ++ _|_
inits ([] ++ _|_) = inits [] ++ _|_
inits _|_ = [[]] ++ _|_
inits _|_ = [] : _|_
We can see this experimentally
> head . inits $ undefined
[]
myInits
does not have this property either for the empty list or for longer lists.
> head $ myInits undefined
*** Exception: Prelude.undefined
> take 3 $ myInits ([1,2] ++ undefined)
[[],[1]*** Exception: Prelude.undefined
We can fix this if we realize that f
in myInits
would yield start []
in either branch. Therefore, we can delay the pattern matching until it is needed to decide what to do next.
myInits' = f id
where
f start list = (start []):
case list of
(l:ls) -> f (start . (l:)) ls
[] -> []
This increase in laziness makes myInits'
work just like inits
.
> head $ myInits' undefined
[]
> take 3 $ myInits' ([1,2] ++ undefined)
[[],[1],[1,2]]
Similarly, the difference between your myTails
and tails
in Data.List
is that tails
yields the entire list as the first result before deciding if there will be a remainder of the list. The documentation says it obeys tails _|_ = _|_ : _|_
, but it actually obeys a much stronger rule that's hard to describe easily.
这篇关于'inits'的有效版本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!