Does Rust devirtualize trait object function calls?

Question

devirtualize: to change a virtual/polymorphic/indirect function call into a static function call due to some guarantee that the change is correct -- source: myself

Given a simple trait object, &dyn ToString, created with a statically known type, String:

fn main() {
    let name: &dyn ToString = &String::from("Steve");
    println!("{}", name.to_string());
}

Does the call to .to_string() use <String as ToString>::to_string() directly? Or only indirectly via the trait's vtable? If indirectly, would it be possible to devirtualize this call? Or is there something fundamental that hinders this optimization?

The motivating code for this question is much more complicated; it uses async trait functions and I'm wondering if returning a Box<dyn Future> can be optimized in some cases.

Answer

Does Rust devirtualize trait object function calls?

No.

Rust is a language, it doesn't do anything; it only prescribes semantics.

In this specific case, the Rust language doesn't prescribe devirtualization, so an implementation is permitted to do it.

At the moment, the only stable implementation is rustc, with the LLVM backend -- though you can use the cranelift backend if you feel adventurous.

You can test your code against this implementation on the playground: select "Show LLVM IR" instead of "Run" and "Release" instead of "Debug", and you should be able to check that there is no virtual call.

A revised version of the code isolates the cast to a trait object and the dynamic call in a separate function, which makes the generated IR easier to read:

#[inline(never)]
fn to_string(s: &String) -> String {
    let name: &dyn ToString = s;
    name.to_string()
}

fn main() {
    let name = String::from("Steve");
    let name = to_string(&name);
    println!("{}", name);
}

When run on the playground, this yields, among other things:

; playground::to_string
; Function Attrs: noinline nonlazybind uwtable
define internal fastcc void @_ZN10playground9to_string17h4a25abbd46fc29d4E(%"std::string::String"* noalias nocapture dereferenceable(24) %0, %"std::string::String"* noalias readonly align 8 dereferenceable(24) %s) unnamed_addr #0 {
start:
; call <alloc::string::String as core::clone::Clone>::clone
  tail call void @"_ZN60_$LT$alloc..string..String$u20$as$u20$core..clone..Clone$GT$5clone17h1e3037d7443348baE"(%"std::string::String"* noalias nocapture nonnull sret dereferenceable(24) %0, %"std::string::String"* noalias nonnull readonly align 8 dereferenceable(24) %s)
  ret void
}

Here you can clearly see that the call to ToString::to_string has been replaced by a simple call to <String as Clone>::clone: a devirtualized call.

The motivating code for this question is much more complicated; it uses async trait functions and I'm wondering if returning a Box<dyn Future> can be optimized in some cases.

Unfortunately, you cannot draw any conclusion from the above example.

Optimizations are finicky. In essence, most optimizations are akin to pattern-matching and replacing with regexes: differences that look benign to a human may completely throw off the pattern-matching and prevent the optimization from applying.
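As an illustration of how a small change can take the call out of reach of the optimizer, here is a minimal sketch. The names describe_dyn, describe_static and pick are hypothetical and not part of the original question. In describe_dyn the concrete type behind the trait object depends on a runtime value, so the call cannot be resolved to a single known target at compile time and an indirect call through the vtable is to be expected. In describe_static the generic parameter is resolved at compile time, so no devirtualization is needed in the first place.

```rust
// Dynamic dispatch: the callee only sees a trait object.
// #[inline(never)] keeps the caller's knowledge of the type from leaking in.
#[inline(never)]
fn describe_dyn(value: &dyn ToString) -> String {
    value.to_string()
}

// Static dispatch: the compiler generates one copy per concrete T,
// so the target of to_string() is known at compile time.
fn describe_static<T: ToString>(value: &T) -> String {
    value.to_string()
}

// The concrete type behind the returned trait object is chosen at runtime.
fn pick(use_number: bool) -> Box<dyn ToString> {
    if use_number {
        Box::new(42_i32)
    } else {
        Box::new(String::from("Steve"))
    }
}

fn main() {
    // Depends on how the program is invoked, so it is unknown at compile time.
    let use_number = std::env::args().count() > 1;
    let value = pick(use_number);
    println!("{}", describe_dyn(&*value));
    println!("{}", describe_static(&String::from("Steve")));
}
```

Whether a given call ends up direct or indirect still has to be confirmed in the generated IR or assembly; the sketch only shows the two shapes of code to compare.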

The only way to be certain that the optimization is applied in your case, if it matters, is to inspect the emitted assembly.

But really, in this case, I'd be more worried about the memory allocation than about the virtual call. A virtual call is about 5ns of overhead -- though it does inhibit a number of optimizations -- whereas a memory allocation (and the eventual deallocation) routinely costs 20ns to 30ns.
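To make the allocation concern concrete, here is a rough sketch of the two shapes of code being compared. It assumes the async trait function is written so that it returns a boxed, type-erased future, which is one common way to express such functions. Greeter, English, greet_static and block_on are hypothetical names for illustration; block_on is a toy busy-polling executor, not something to use in real code.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait whose async-style method returns a boxed future:
// every call allocates, and polling goes through the future's vtable.
trait Greeter {
    fn greet<'a>(&'a self, name: &'a str) -> Pin<Box<dyn Future<Output = String> + 'a>>;
}

struct English;

impl Greeter for English {
    fn greet<'a>(&'a self, name: &'a str) -> Pin<Box<dyn Future<Output = String> + 'a>> {
        Box::pin(async move { format!("Hello, {name}") })
    }
}

// The same logic as a plain async fn: the returned future is a concrete,
// unboxed type, so no heap allocation is required to create it.
async fn greet_static(name: &str) -> String {
    format!("Hello, {name}")
}

// Waker that does nothing; enough for futures that never wait on anything.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (assumption: illustration only): polls in a loop until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let greeter: &dyn Greeter = &English;
    println!("{}", block_on(greeter.greet("Steve")));
    println!("{}", block_on(greet_static("Steve")));
}
```

Both calls produce the same string; the difference is that the trait version pays for a Box per call unless the optimizer manages to remove it, which again can only be confirmed by inspecting the output for the real code.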
