F# ref-mutable vars vs object fields


Question

I'm writing a parser in F#, and it needs to be as fast as possible (I'm hoping to parse a 100 MB file in less than a minute). As usual, it uses mutable variables to store the next available character and the next available token (i.e. both the lexer and the parser proper use one unit of lookahead).

My current partial implementation uses local variables for these. Since closure variables can't be mutable (does anyone know the reason for this?), I've declared them as ref:

let rec read file includepath =
    let c = ref ' '
    let k = ref NONE
    let sb = new StringBuilder()
    use stream = File.OpenText file

    let readc() =
        c := stream.Read() |> char
    // etc



I assume this has some overhead (not much, I know, but I'm trying for maximum speed here), and it's a little inelegant. The most obvious alternative would be to create a parser class object and have the mutable variables be fields in it. Does anyone know which is likely to be faster? Is there any consensus on which is considered better or more idiomatic style? Is there another option I'm missing?

Recommended answer

You mentioned that local mutable values cannot be captured by a closure, so you need to use ref instead. The reason is that a mutable value captured by a closure needs to be allocated on the heap (because the closure itself is allocated on the heap).

F# forces you to write this explicitly (using ref). In C# you can "capture a mutable variable", but the compiler translates it behind the scenes into a field of a heap-allocated object, so it ends up on the heap anyway.

In summary: if you want to use closures, mutable variables need to be allocated on the heap.
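
To make the summary concrete, here is a minimal sketch of a closure that keeps its state in a ref cell. The names makeCounter, next, first and second are illustrative and not from the question. The sketch uses the .Value property rather than the := and ! operators shown in the question; both forms access the same heap-allocated cell.

```fsharp
// Builds a counter whose state lives in a heap-allocated ref cell.
// The returned closure holds a reference to that cell, so the state
// survives after makeCounter itself has returned.
let makeCounter () =
    let count = ref 0
    fun () ->
        count.Value <- count.Value + 1
        count.Value

let next = makeCounter ()
let first = next ()    // first call increments the cell from 0 to 1
let second = next ()   // second call sees the updated cell
```

Each call to makeCounter allocates a fresh cell, so two counters built this way do not share state.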

Now, regarding your code: your implementation uses ref, which creates a small object for every mutable variable that you're using. An alternative would be to create a single object with multiple mutable fields. Using records, you could write:

type ReadClosure = {
  mutable c : char
  mutable k : SomeType } // whatever type you use here

let rec read file includepath = 
  let state = { c = ' '; k = NONE } 
  // ... 
  let readc() = 
    state.c <- stream.Read() |> char 
    // etc...

This may be a bit more efficient, because you're allocating a single object instead of several, but I don't expect the difference to be noticeable.
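
A self-contained version of the record approach might look like the sketch below. It makes several assumptions: the Token type is a hypothetical stand-in for whatever token type the parser really uses, ReaderState and countChars are illustrative names, and a TextReader parameter replaces the file so the example runs without touching the disk. Unlike the snippet in the question, it also checks for the -1 that Read returns at end of input.

```fsharp
open System.IO

// Hypothetical token type standing in for the parser's own definition.
type Token =
    | NONE
    | CHAR of char

// One heap object holding all lookahead state,
// instead of one ref cell per variable.
type ReaderState =
    { mutable c : char
      mutable k : Token }

// Reads every character from the reader and returns the number of
// characters read together with the last character seen.
let countChars (reader : TextReader) =
    let state = { c = ' '; k = NONE }
    // Not captured by any closure, so a plain mutable local is enough.
    let mutable count = 0
    let readc () =
        let n = reader.Read()
        if n < 0 then
            false                      // end of input
        else
            state.c <- char n          // closure mutates the shared record
            state.k <- CHAR state.c
            true
    while readc () do
        count <- count + 1
    count, state.c
```

Calling countChars (new StringReader("abc")) walks the three characters and leaves 'c' in the state record. Whether this beats separate ref cells in practice is something only a measurement on the real parser can answer.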

There is also one confusing thing about your code: the stream value will be disposed when the function read returns, so the call to stream.Read may be invalid (if you call readc after read completes).

let rec read file includepath =    
  let c = ref ' '    
  use stream = File.OpenText file    
  let readc() =    
    c := stream.Read() |> char    
  readc

let f = read a1 a2
f() // This would fail!
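
The failure can be reproduced without a file. The sketch below is an illustration under stated assumptions: a StringReader stands in for File.OpenText, and readLeaky, readSafe and leakyFails are hypothetical names. The first function returns the closure after the reader has been disposed; the second does its reading while the reader is still in scope and returns plain data.

```fsharp
open System
open System.IO

// Returns a closure over a reader that is already disposed
// by the time the caller receives it.
let readLeaky (text : string) =
    let c = ref ' '
    use reader = new StringReader(text)   // stand-in for File.OpenText
    let readc () =
        c.Value <- reader.Read() |> char
        c.Value
    readc   // 'use' disposes the reader as readLeaky returns

// Does all the reading while the reader is still alive
// and returns the character itself rather than a closure.
let readSafe (text : string) =
    use reader = new StringReader(text)
    let c = ref ' '
    let readc () =
        c.Value <- reader.Read() |> char
    readc ()
    c.Value

// True if calling the leaked closure raises ObjectDisposedException.
let leakyFails =
    let f = readLeaky "abc"
    try
        f () |> ignore
        false
    with :? ObjectDisposedException -> true
```

If the closure really has to outlive the function, the caller would need to own the reader and dispose it itself, rather than relying on use inside read.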



I'm not quite sure how you're actually using readc, but this may be a problem to think about. Also, if you're declaring it only as a helper closure, you could probably rewrite the code without the closure (or write it explicitly using tail recursion, which is translated into an imperative loop with mutable variables) to avoid any allocations.
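
As a rough illustration of the tail-recursive style, the sketch below counts whitespace-separated words with the state passed as parameters instead of being stored in ref cells. It is not the asker's lexer: wordLoop and countWords are hypothetical names, and word counting is only a placeholder for real tokenizing. The reader is passed explicitly so that nothing needs to be captured.

```fsharp
open System
open System.IO

// Tail-recursive loop: 'inWord' and 'count' carry the state that
// would otherwise live in mutable variables or ref cells.
let rec wordLoop (reader : TextReader) inWord count =
    match reader.Read() with
    | -1 ->
        // End of input: close off a word that is still open.
        if inWord then count + 1 else count
    | n when Char.IsWhiteSpace(char n) ->
        // Whitespace ends the current word, if there is one.
        wordLoop reader false (if inWord then count + 1 else count)
    | _ ->
        wordLoop reader true count

// Counts whitespace-separated words in the reader's remaining input.
let countWords (reader : TextReader) =
    wordLoop reader false 0
```

For example, countWords (new StringReader("let x = 1")) sees four words. Every recursive call is in tail position, which is the shape the compiler can turn into a loop.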
