避免堆栈溢出(使用F#无限序列序列) [英] Avoiding stack overflow (with F# infinite sequences of sequences)

查看:82
本文介绍了避免堆栈溢出(使用F#无限序列序列)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有我为f#中的morris seq编写的学习代码",该代码遭受堆栈溢出的困扰,我不知道如何避免. "morris"返回无限个见见"序列(即{{1},{1,1},{2,1},{1,2,1,1},{1,1,1 ,2,2,1},{3,1,2,2,1,1},...}).

I have this "learning code" I wrote for the morris seq in f# that suffers from stack overflow that I don't know how to avoid. "morris" returns an infinite sequence of "see and say" sequences (i.e., {{1}, {1,1}, {2,1}, {1,2,1,1}, {1,1,1,2,2,1}, {3,1,2,2,1,1},...}).

    let printList l =
        Seq.iter (fun n -> printf "%i" n) l
        printfn ""

    let rec morris s = 
        let next str = seq {
            let cnt = ref 1  // Stack overflow is below when enumerating
            for cur in [|0|] |> Seq.append str |> Seq.windowed 2 do
                if cur.[0] <> cur.[1] then
                    yield!( [!cnt ; cur.[0]] )
                    cnt := 0
                incr cnt
        }
        seq {
        yield s
        yield! morris (next s) // tail recursion, no stack overflow
        }

    // "main"
    // Print the nth iteration
    let _ =  [1] |> morris |> Seq.nth 3125 |> printList 

您可以使用Seq.nth进行第n次迭代,但是您只能等到堆栈溢出为止.我拥有的一点递归是尾递归,从本质上讲,它构建了一组链接的枚举器.那不是问题所在.这是在第4000个序列上调用枚举"的时候.请注意,这与F#1.9.6.16一起使用,以前的版本在14000以上排名最高.这是因为链接序列的解析方式.序列是惰性的,因此递归"是惰性的.也就是说,seq n调用seq n-1,再调用seq n-2,依此类推以获取第一项(最开始的#是最坏的情况).

You can pick off the nth iteration using Seq.nth but you can only get so far before you hit a stack overflow. The one bit of recursion I have is tail recursion and it in essence builds a linked set of enumerators. That's not where the problem is. It's when "enum" is called on the say the 4000th sequence. Note that's with F# 1.9.6.16, the previous version topped out above 14000). It's because the way the linked sequences are resolved. The sequences are lazy and so the "recursion" is lazy. That is, seq n calls seq n-1 which calls seq n-2 and so forth to get the first item (the very first # is the worst case).

我知道[|0|] |> Seq.append str |> Seq.windowed 2使我的问题变得更糟,如果我消除了这个问题,我可以将生成的#数提高三倍.实际上,该代码运行良好. morris的第3125次迭代的长度将超过10 ^ 359个字符.

I understand that [|0|] |> Seq.append str |> Seq.windowed 2, is making my problem worse and I could triple the # I could generate if I eliminated that. Practically speaking the code works well enough. The 3125th iteration of morris would be over 10^359 characters in length.

我真正要解决的问题是如何保留惰性评估,并且对基于栈的大小没有限制,可以进行迭代.我正在寻找适当的F#惯用法来根据内存大小进行限制.

The problem I'm really trying to solve is how to retain the lazy eval and have a no limit based on stack size for the iteration I can pick off. I'm looking for the proper F# idiom to make the limit based on memory size.

更新10年10月

在学习了F#之后,又有了一点点Haskell的思考和思考.经过一年多的调查,我终于可以回答我自己的问题了.但是,像往常一样,遇到棘手的问题,问题始于它是一个错误的问题.问题不在于序列的序列-确实是由于递归定义的序列.我的函数式编程技能现在有所提高,因此更容易查看下面的版本的运行情况,该版本仍会出现stackoverflow

After learning F# a bit better, a tiny bit of Haskell, thinking & investigating this problem for over year, I finally can answer my own question. But as always with difficult problems, the problem starts with it being the wrong question. The problem isn't sequences of sequences - it's really because of a recursively defined sequence. My functional programming skills are a little better now and so it's easier to see what's going on with the version below, which still gets a stackoverflow

let next str = 
    Seq.append str [0]
    |> Seq.pairwise
    |> Seq.scan (fun (n,_) (c,v) ->
            if (c = v) then (n+1,Seq.empty)
            else (1,Seq.ofList [n;c]) ) (1,Seq.empty)
    |> Seq.collect snd

let morris = Seq.unfold(fun sq -> Some(sq,next sq))

从根本上来说,这会创建很长的Seq处理函数调用链,以生成序列. F#随附的Seq模块是不使用堆栈就无法遵循链的内容.有一个用于追加和递归定义的序列的优化,但是该优化仅在递归实现追加时有效.

That basicially creates a really long chain of Seq processing function calls to generate the sequnces. The Seq module that comes with F# is what can't follow the chain without using the stack. There's an optimization it uses for append and recursively defined sequences, but that optimization only works if the recursion is implementing an append.

这将起作用

let rec ints n = seq { yield n; yield! ints (n+1) }
printf "%A" (ints 0 |> Seq.nth 100000);;

这将导致堆栈溢出.

let rec ints n = seq { yield n; yield! (ints (n+1)|> Seq.map id) }
printf "%A" (ints 0 |> Seq.nth 100000);;

为证明F#库是问题所在,我编写了自己的Seq模块,该模块实现了使用连续的追加,成对,扫描和收集的功能,现在我可以开始生成并打印出50,000个seq了(没有问题(永远不会完成)因为它的长度超过10 ^ 5697个数字.

To prove the F# libary was the issue, I wrote my own Seq module that implemented append, pairwise, scan and collect using continutions and now I can begin generating and printing out the 50,000 seq without a problem (it'll never finish since it's over 10^5697 digits long).

一些附加说明:

  • 延续是我一直在寻找的习惯用法,但是在这种情况下,它们必须进入F#库,而不是我的代码.我从 Tomas Petricek的 实际功能编程一书中了解了F#的延续.
  • 我接受的懒惰清单答案是另一种成语;懒惰的评价.在重写的库中,我还必须利用惰性类型来避免stackoverflow.
  • 运气的惰性列表版本排序方式(可能是设计使然,但这超出了我当前的确定能力)-在构造和迭代时使用的活动模式匹配使列表在所需的递归得到之前也要计算值很深,所以它很懒,但不是那么懒,它需要继续以避免堆栈溢出.例如,当第二个序列需要第一个序列中的数字时,它已经被计算出来了.换句话说,LL版本对于序列生成并不是严格的JIT懒惰,只是列表管理.
  • Continuations were the idiom I was looking for, but in this case, they had to go into the F# library, not my code. I learned about continuations in F# from Tomas Petricek's Real-World Functional Programming book.
  • The lazy list answer that I accepted held the other idiom; lazy evaluation. In my rewritten library, I also had to leverage the lazy type to avoid stackoverflow.
  • The lazy list version sorta of works by luck (maybe by design but that's beyond my current ability to determine) - the active-pattern matching it uses while it's constructing and iterating causes the lists to calculate values before the required recursion gets too deep, so it's lazy, but not so lazy it needs continuations to avoid stackoverflow. For example, by the time the 2nd sequence needs a digit from the 1st sequence, it's already been calculated. In other words, the LL version is not strictly JIT lazy for sequence generation, only list management.

推荐答案

您一定要签出

但是我稍后会尝试发布更全面的答案.

but I will try to post a more comprehensive answer later.

更新

好的,下面是一个解决方案.它将Morris序列表示为int的LazyLists的LazyList,因为我想您希望它在两个方向"上都是惰性的.

Ok, a solution is below. It represents the Morris sequence as a LazyList of LazyLists of int, since I presume you want it to be lazy in 'both directions'.

F#LazyList(位于FSharp.PowerPack.dll中)具有三个有用的属性:

The F# LazyList (in the FSharp.PowerPack.dll) has three useful properties:

  • 这是懒惰的(第一个元素的求值直到第一次需要时才进行评估)
  • 它不会重新计算(在同一对象实例上重新计算第n个元素不会重新计算-它会在第一次计算后缓存每个元素)
  • 您可以忘记"前缀(当您拖尾"进入列表时,可以使用不再引用的前缀进行垃圾回收)

第一个属性与seq(IEnumerable)相同,但是另两个属性对于LazyList来说是唯一的,并且对于计算问题(例如在此问题中提出的问题)非常有用.

The first property is common with seq (IEnumerable), but the other two are unique to LazyList and very useful for computational problems such as the one posed in this question.

事不宜迟,代码:

// print a lazy list up to some max depth
let rec PrintList n ll =
    match n with
    | 0 -> printfn ""
    | _ -> match ll with
           | LazyList.Nil -> printfn ""
           | LazyList.Cons(x,xs) ->
               printf "%d" x
               PrintList (n-1) xs

// NextMorris : LazyList<int> -> LazyList<int>
let rec NextMorris (LazyList.Cons(cur,rest)) = 
    let count = ref 1
    let ll = ref rest
    while LazyList.nonempty !ll && (LazyList.hd !ll) = cur do
        ll := LazyList.tl !ll
        incr count
    LazyList.cons !count
        (LazyList.consf cur (fun() ->
            if LazyList.nonempty !ll then
                NextMorris !ll
            else
                LazyList.empty()))

// Morris : LazyList<int> -> LazyList<LazyList<int>>
let Morris s =
    let rec MakeMorris ll =
        LazyList.consf ll (fun () ->
            let next = NextMorris ll
            MakeMorris next
        )
    MakeMorris s

// "main"
// Print the nth iteration, up to a certain depth
[1] |> LazyList.of_list |> Morris |> Seq.nth 3125 |> PrintList 10
[1] |> LazyList.of_list |> Morris |> Seq.nth 3126 |> PrintList 10
[1] |> LazyList.of_list |> Morris |> Seq.nth 100000 |> PrintList 35
[1] |> LazyList.of_list |> Morris |> Seq.nth 100001 |> PrintList 35

UPDATE2

如果您只想计数,那也可以:

If you just want to count, that's fine too:

let LLLength ll =
    let rec Loop ll acc =
        match ll with
        | LazyList.Cons(_,rest) -> Loop rest (acc+1N)
        | _ -> acc
    Loop ll 0N

let Main() =
    // don't do line below, it leaks
    //let hundredth = [1] |> LazyList.of_list |> Morris |> Seq.nth 100
    // if we only want to count length, make sure we throw away the only
    // copy as we traverse it to count
    [1] |> LazyList.of_list |> Morris |> Seq.nth 100
        |> LLLength |> printfn "%A" 
Main()    

内存使用率保持稳定(我的盒子上不到16M)...尚未完成运行,但是即使在我的慢盒子上,我也能快速计算出第55个长度,所以我认为这应该很好.还要注意,我使用了'bignum'作为长度,因为我认为这会溢出'int'.

The memory usage stays flat (under 16M on my box)... hasn't finished running yet, but I computed the 55th length fast, even on my slow box, so I think this should work just fine. Note also that I used 'bignum's for the length, since I think this will overflow an 'int'.

这篇关于避免堆栈溢出(使用F#无限序列序列)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆