Swift function object wrapper in apple/swift

Question

After reading up on this, I understood that a Swift function pointer is wrapped by swift_func_wrapper and swift_func_object (according to an article from 2014).

I guess this still works in Swift 3, but I couldn't find which file in https://github.com/apple/swift best describes these structs.

Can anyone help me?

Recommended answer

I believe these details are mainly part of the implementation of Swift's IRGen – I don't think you'll find any friendly structs in the source showing you the full structure of various Swift function values. Therefore if you want to do some digging into this, I would recommend examining the IR emitted by the compiler.

You can do this by running the command:

xcrun swiftc -emit-ir main.swift | xcrun swift-demangle > main.irgen

which will emit the IR (with demangled symbols) for a -Onone build. The IR syntax itself is documented in the LLVM Language Reference Manual.
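As a starting point, a small main.swift along the following lines could be fed to that command. This is only a hypothetical input file (the names makeClosure and f are illustrative), chosen because it contains one returned closure with one captured variable, so the emitted IR should show a function value and a boxed capture.

```swift
// main.swift: a small input containing a closure that captures a local variable.
func makeClosure() -> () -> Int {
    var i = 5
    return {
        i += 2
        return i
    }
}

let f = makeClosure()
print(f()) // 7
```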

The following is some interesting stuff that I've been able to learn from going through the IR myself in a Swift 3.1 build. Note that this is all subject to change in future Swift versions (at least until Swift is ABI stable). It goes without saying that the code examples given below are only for demonstration purposes; and shouldn't ever be used in actual production code.

At a very basic level, function values in Swift are simple things – they're defined in the IR as:

%swift.function = type { i8*, %swift.refcounted* }

which is the raw function pointer i8*, along with a pointer to its context %swift.refcounted*, where %swift.refcounted is defined as:

%swift.refcounted = type { %swift.type*, i32, i32 }

which is the structure of a simple reference-counted object, containing a pointer to the object's metadata, along with two 32 bit values.

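These two 32 bit values are used for the reference count of the object. Together, they can represent either (as of Swift 4):

  • The strong and unowned reference counts of the object, plus some flags, including whether the object uses native Swift reference counting (as opposed to Obj-C reference counting) and whether the object has a side table.

  • A pointer to a side table, which contains the above plus the weak reference count of the object (when a weak reference to an object is formed, a side table is created if it doesn't already have one).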

For further reading on the internals of Swift reference counting, Mike Ash has a great blog post on the subject.

The context of a function usually adds extra values onto the end of this %swift.refcounted structure. These values are dynamic things that the function needs upon being called (such as any values that it has captured, or any parameters that it has been partially applied with). In quite a few cases, function values won't need a context, so the pointer to the context will simply be nil.

When the function comes to be called, Swift will simply pass in the context as the last parameter. If the function doesn't have a context parameter, the calling convention appears to allow it to be safely passed anyway.
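The idea of "a code pointer plus a context passed as the last argument" can be modelled by hand in ordinary Swift. The sketch below is not what the compiler emits; ManualThickFunction and CounterContext are hypothetical names, and @convention(c) is used only because it gives a plain code pointer with no hidden context of its own.

```swift
// A hand-written model of a thick function value: code pointer + context.
// These types are illustrative assumptions, not the compiler's real layout.
final class CounterContext {
    var i = 5
}

struct ManualThickFunction {
    // A context-free code pointer that receives the context as its last (here: only) argument.
    let code: @convention(c) (UnsafeMutableRawPointer?) -> Void
    // The context object; nil would model a function that needs no context.
    let context: AnyObject?

    func call() {
        // Pass the context along with the call, as the text describes.
        let raw = context.map { Unmanaged.passUnretained($0).toOpaque() }
        code(raw)
    }
}

let counter = CounterContext()
let manual = ManualThickFunction(
    code: { raw in
        guard let raw = raw else { return }
        // Recover the "captured" state from the context pointer.
        let context = Unmanaged<CounterContext>.fromOpaque(raw).takeUnretainedValue()
        context.i += 2
    },
    context: counter
)

manual.call()
print(counter.i) // 7
```

Here the closure literal captures nothing, so all of its state has to arrive through the context argument, which is roughly the role the context plays for a real closure.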

The storing of the function pointer along with the context pointer is called a thick function value,
and is how Swift usually stores function values of known type (as opposed to a thin function value which is just the function pointer).

So, this explains why MemoryLayout<(Int) -> Int>.size returns 16 bytes: it's made up of two pointers (each being a word in length, i.e. 8 bytes on a 64 bit platform).
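This can be checked without any unsafe code. The sketch below compares the size of an ordinary Swift function type against the pointer size, and uses a @convention(c) function type as a stand-in for a thin function value (an assumption made for illustration, since it is stored as a bare code pointer). The typealias names are hypothetical.

```swift
// Compare function value sizes against the platform word size.
typealias SwiftFunction = (Int) -> Int
typealias CFunction = @convention(c) (Int) -> Int

let wordSize = MemoryLayout<UnsafeRawPointer>.size
let thickSize = MemoryLayout<SwiftFunction>.size
let thinSize = MemoryLayout<CFunction>.size

// A thick function value is two words: function pointer + context pointer.
print(thickSize == 2 * wordSize) // true
// A C function pointer is a single word: just the function pointer.
print(thinSize == wordSize)      // true
```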

When thick function values are passed into function parameters (where those parameters are of non-generic type), Swift appears to pass the raw function pointer and context as separate parameters.

When a closure captures a value, this value will be put into a heap-allocated box (although the value itself can get stack-promoted in the case of a non-escaping closure – see later section). This box will be available to the function through the context object (the relevant IR).

For a closure that just captures a single value, Swift just makes the box itself the context of the function (no need for extra indirection). So you'll have a function value which looks like a ThickFunction<Box<T>> from the following structures:

// The structure of a %swift.function.
struct ThickFunction<Context> {

    // the raw function pointer
    var ptr: UnsafeRawPointer

    // the context of the function value – can be nil to indicate
    // that the function has no context.
    var context: UnsafePointer<Context>?
}

// The structure of a %swift.refcounted.
struct RefCounted {

    // pointer to the metadata of the object
    var type: UnsafeRawPointer

    // the reference counting bits.
    var refCountingA: UInt32
    var refCountingB: UInt32
}

// The structure of a %swift.refcounted, with a value tacked onto the end.
// This is what captured values get wrapped in (on the heap).
struct Box<T> {
    var ref: RefCounted
    var value: T
}

In fact, we can verify this for ourselves by running the following:

// this wrapper is necessary so that the function doesn't get put through a reabstraction
// thunk when getting typed as a generic type T (such as with .initialize(to:))
struct VoidVoidFunction {
    var f: () -> Void
}

func makeClosure() -> () -> Void {
    var i = 5
    return { i += 2 }
}

let f = VoidVoidFunction(f: makeClosure())

let ptr = UnsafeMutablePointer<VoidVoidFunction>.allocate(capacity: 1)
ptr.initialize(to: f)

let ctx = ptr.withMemoryRebound(to: ThickFunction<Box<Int>>.self, capacity: 1) { 
    $0.pointee.context! // force unwrap as we know the function has a context object.
}

print(ctx.pointee) 
// Box<Int>(ref:
//     RefCounted(type: 0x00000001002b86d0, refCountingA: 2, refCountingB: 2),
//     value: 5
// )

f.f() // call the closure – increment the captured value.

print(ctx.pointee)
// Box<Int>(ref:
//     RefCounted(type: 0x00000001002b86d0, refCountingA: 2, refCountingB: 2),
//     value: 7
// )

ptr.deinitialize()
ptr.deallocate(capacity: 1)

By calling the function between the two prints of the context object, we can observe the change in value of the captured variable i.

For multiple captured values, we need extra indirection, as the boxes cannot be stored directly as the given function's context, and may be captured by other closures. This is done by adding pointers to the boxes to the end of a %swift.refcounted.

For example:

struct TwoCaptureContext<T, U> {

    // reference counting header
    var ref: RefCounted

    // pointers to boxes with captured values...
    var first: UnsafePointer<Box<T>>
    var second: UnsafePointer<Box<U>>
}

func makeClosure() -> () -> Void {
    var i = 5
    var j = "foo"
    return { i += 2; j += "b" }
}

let f = VoidVoidFunction(f: makeClosure())

let ptr = UnsafeMutablePointer<VoidVoidFunction>.allocate(capacity: 1)
ptr.initialize(to: f)

let ctx = ptr.withMemoryRebound(to:
                  ThickFunction<TwoCaptureContext<Int, String>>.self, capacity: 1) {
    $0.pointee.context!.pointee
}

print(ctx.first.pointee.value, ctx.second.pointee.value) // 5 foo

f.f() // call the closure – mutate the captured values.

print(ctx.first.pointee.value, ctx.second.pointee.value) // 7 foob

ptr.deinitialize()
ptr.deallocate(capacity: 1)
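The point that a box may be shared by several closures can also be observed from safe code, without inspecting memory. In the sketch below (makeCounter and its tuple labels are hypothetical names), two closures capture the same two variables, and a mutation made through one closure is visible through the other, which is consistent with both contexts pointing at the same boxes.

```swift
// Two closures capturing the same variables observe each other's mutations.
func makeCounter() -> (increment: () -> Void, describe: () -> String) {
    var count = 0
    var log = ""
    let increment: () -> Void = {
        count += 1
        log += "+"
    }
    let describe: () -> String = { "\(count) \(log)" }
    return (increment: increment, describe: describe)
}

let sharedCounter = makeCounter()
sharedCounter.increment()
sharedCounter.increment()
print(sharedCounter.describe()) // 2 ++
```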

Passing functions into parameters of generic type

You'll note that in the previous examples, we used a VoidVoidFunction wrapper for our function values. This is because otherwise, when being passed into a parameter of generic type (such as UnsafeMutablePointer's initialize(to:) method), Swift will put a function value through some reabstraction thunks in order to unify its calling convention to one where the arguments and return are passed by reference, rather than value (the relevant IR).

But now our function value has a pointer to the thunk, rather than the actual function we want to call. So how does the thunk know which function to call? The answer is simple: Swift puts the function that we want the thunk to call in the context itself, which will therefore look like this:

// the context object for a reabstraction thunk – contains an actual function to call.
struct ReabstractionThunkContext<Context> {

    // the standard reference counting header
    var ref: RefCounted

    // the thick function value for the thunk to call
    var function: ThickFunction<Context>
}

The first thunk that we go through has 3 parameters:

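  1. A pointer to where the return value should be stored
  2. A pointer to where the function's arguments are located
  3. The context object, which contains the actual thick function value to call (as shown above)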

This first thunk just extracts the function value from the context, and then calls a second thunk, with 4 parameters:

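  1. A pointer to where the return value should be stored
  2. A pointer to where the function's arguments are located
  3. The raw function pointer to call
  4. A pointer to the context of the function to call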

This thunk now retrieves the arguments (if any) from the argument pointer, then calls the given function pointer with these arguments, along with its context. It then stores the return value (if any) at the address of the return pointer.
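The two-step hand-off can be modelled in plain Swift to make the data flow concrete. This is a simplified sketch, not the compiler's generated thunks: firstThunk, secondThunk and ThunkContext are hypothetical names, the types are fixed to Int, and the raw function pointer and its context are bundled together as an ordinary Swift function value rather than passed as two separate parameters.

```swift
// Hypothetical stand-in for the thunk's context: it holds the real function to call.
struct ThunkContext {
    var function: (Int) -> Int
}

// Model of the second thunk: load the argument indirectly, call the real
// function, and store the result indirectly.
func secondThunk(result: UnsafeMutablePointer<Int>,
                 argument: UnsafePointer<Int>,
                 function: (Int) -> Int) {
    result.initialize(to: function(argument.pointee))
}

// Model of the first thunk: pull the function value out of the context and forward it.
func firstThunk(result: UnsafeMutablePointer<Int>,
                argument: UnsafePointer<Int>,
                context: ThunkContext) {
    secondThunk(result: result, argument: argument, function: context.function)
}

let addOne: (Int) -> Int = { $0 + 1 }
var argument = 41
let result = UnsafeMutablePointer<Int>.allocate(capacity: 1)

withUnsafePointer(to: &argument) {
    firstThunk(result: result, argument: $0, context: ThunkContext(function: addOne))
}

let thunkOutput = result.pointee
print(thunkOutput) // 42

result.deinitialize(count: 1)
result.deallocate()
```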

As in the previous examples, we can test this like so:

func makeClosure() -> () -> Void {
    var i = 5
    return { i += 2 }
}

func printSingleCapturedValue<T>(t: T) {

    let ptr = UnsafeMutablePointer<T>.allocate(capacity: 1)
    ptr.initialize(to: t)

    let ctx = ptr.withMemoryRebound(to:
        ThickFunction<ReabstractionThunkContext<Box<Int>>>.self, capacity: 1) {
        // get the context from the thunk function value, which we can
        // then get the actual function value from, and therefore the actual
        // context object.
        $0.pointee.context!.pointee.function.context!
    }

    // print out captured value in the context object
    print(ctx.pointee.value)

    ptr.deinitialize()
    ptr.deallocate(capacity: 1)
}

let closure = makeClosure()

printSingleCapturedValue(t: closure) // 5
closure()
printSingleCapturedValue(t: closure) // 7

Escaping vs. non-escaping capture

When the compiler can determine that the capture of a given local variable doesn't escape the lifetime of the function it's declared in, it can optimise by promoting the value of that variable from the heap-allocated box to the stack (this is a guaranteed optimisation, and occurs even in -Onone). Then, the function's context object need only store a pointer to the given captured value on the stack, as it is guaranteed not to be needed after the function exits.

This can therefore be done when the closure(s) capturing the variable are known not to escape the lifetime of the function.

Generally, an escaping closure is one that either:

  • Is stored in a non-local variable (including being returned from the function).
  • Is captured by another escaping closure.
  • Is passed as an argument to a function where that parameter is either marked as @escaping, or is not of function type (note this includes composite types, such as optional function types).
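For contrast with the non-escaping cases, the rules above can be sketched in source form as follows. The helper names (storedClosures, keep, runNow, runIfPresent, demo) are hypothetical; the comments state which rule each call site falls under, not what any particular compiler version actually emits for it.

```swift
// A non-local place to store closures.
var storedClosures: [() -> Void] = []

// Parameter marked @escaping: closures passed here escape.
func keep(_ f: @escaping () -> Void) { storedClosures.append(f) }

// Plain function-typed parameter: closures passed here do not escape.
func runNow(_ f: () -> Void) { f() }

// Optional function type is not itself a function type, so closures passed here escape.
func runIfPresent(_ f: (() -> Void)?) { f?() }

func demo() -> Int {
    var n = 0
    runNow { n += 1 }          // non-escaping use
    keep { n += 10 }           // escaping: @escaping parameter, stored globally
    runIfPresent { n += 100 }  // escaping: optional function type
    storedClosures.forEach { $0() }
    return n
}

let demoResult = demo()
print(demoResult) // 111
```

Because n is captured by at least one escaping closure here, it would not be a candidate for the stack promotion described in this section.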

So, the following are examples where the capture of a given variable can be considered not to escape the lifetime of the function:

// the parameter is non-escaping, as is of function type and is not marked @escaping.
func nonEscaping(_ f: () -> Void) {
    f()
}

func bar() -> String {

    var str = ""

    // c doesn't escape the lifetime of bar().
    let c = {
        str += "c called; "
    }

    c();

    // immediately-evaluated closure obviously doesn't escape.
    { str += "immediately-evaluated closure called; " }()

    // closure passed to non-escaping function parameter, so doesn't escape.
    nonEscaping {
        str += "closure passed to non-escaping parameter called."
    }

    return str
}

In this example, because str is only ever captured by closures that are known not to escape the lifetime of the function bar(), the compiler can optimise by storing the value of str on the stack, with the context objects storing only a pointer to it (the relevant IR).

So, the context objects for each of the closures¹ will look like Box<UnsafePointer<String>>, with pointers to the string value on the stack. Unfortunately, in a Schrödinger-like manner, attempting to observe this by allocating and re-binding a pointer (as before) triggers the compiler to treat the given closure as escaping, so we're once again looking at a Box<String> for the context.

To deal with the disparity between context objects that hold pointer(s) to the captured values and those that hold the values in their own heap-allocated boxes, Swift creates specialised implementations of the closures that take pointers to the captured values as arguments.

Then, a thunk is created for each closure that simply takes in a given context object, extracts the pointer(s) to the captured values from it, and passes this onto the specialised implementation of the closure. Now, we can just have a pointer to this thunk along with our context object as the thick function value.
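The relationship between the specialised implementation and its thunk can be modelled by hand as below. This is a sketch under stated assumptions rather than the generated code: specialisedBody, NonEscapingContext, thunk and runModel are hypothetical names, and the context is a plain struct passed by value instead of a reference-counted object.

```swift
// Model of the specialised closure body: it takes a pointer to the captured value.
func specialisedBody(_ captured: UnsafeMutablePointer<String>) {
    captured.pointee += "called"
}

// Model of the context: it only stores a pointer to the value on the stack.
struct NonEscapingContext {
    var captured: UnsafeMutablePointer<String>
}

// Model of the thunk: extract the pointer from the context and forward it.
func thunk(_ context: NonEscapingContext) {
    specialisedBody(context.captured)
}

func runModel() -> String {
    var str = ""
    // The pointer is only valid inside this scope, mirroring a non-escaping capture.
    withUnsafeMutablePointer(to: &str) { pointer in
        thunk(NonEscapingContext(captured: pointer))
    }
    return str
}

let modelOutput = runModel()
print(modelOutput) // called
```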

For multiple captured values that don't escape, the additional pointers are simply added onto the end of the box, i.e.:

struct TwoNonEscapingCaptureContext<T, U> {

    // reference counting header
    var ref: RefCounted

    // pointers to captured values (on the stack)...
    var first: UnsafePointer<T>
    var second: UnsafePointer<U>
}

This optimisation of promoting the captured values from the heap to the stack can be especially beneficial in this case, as we no longer have to allocate separate boxes for each value, as was the case previously.

Furthermore it's worth noting that lots of cases with non-escaping closure capture can be optimised much more aggressively in -O builds with inlining, which can result in context objects being optimised away entirely.

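1. Immediately-evaluated closures don't actually use a context object; the pointer(s) to the captured values are just passed directly to the closure when it is called.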
