How does the Rust compiler know `Cell` has internal mutability?

Question

Consider the following code (Playground version):

use std::cell::Cell;

struct Foo(u32);

#[derive(Clone, Copy)]
struct FooRef<'a>(&'a Foo);

// the bodies of these functions don't matter
fn testa<'a>(x: &FooRef<'a>, y: &'a Foo) { x; }
fn testa_mut<'a>(x: &mut FooRef<'a>, y: &'a Foo) { *x = FooRef(y); }
fn testb<'a>(x: &Cell<FooRef<'a>>, y: &'a Foo) { x.set(FooRef(y)); }

fn main() {
    let u1 = Foo(3);
    let u2 = Foo(5);
    let mut a = FooRef(&u1);
    let b = Cell::new(FooRef(&u1));

    // try one of the following 3 statements
    testa(&a, &u2);         // allow move at (1)
    testa_mut(&mut a, &u2); // deny move -- fine!
    testb(&b, &u2);         // deny move -- but how does rustc know?

    u2;                     // (1) move out
    // ... do something with a or b
}

I'm curious how rustc knows that Cell has interior mutability and may hold on to a reference of the other argument.

If I create another data structure from scratch, similar to Cell which also has interior mutability, how do I tell rustc that?

Recommended answer

The reason the code with Cell compiles (ignoring the u2) and mutates is that Cell's whole API takes & pointers:

impl<T> Cell<T> where T: Copy {
    fn new(value: T) -> Cell<T> { ... }

    fn get(&self) -> T { ... }

    fn set(&self, value: T) { ... }
}

It is carefully written to allow mutation while shared, i.e. interior mutability. This allows it to expose these mutating methods behind a & pointer. Conventional mutation requires a &mut pointer (with its associated non-aliasing restrictions) because having unique access to a value is the only way to ensure that mutating it will be safe, in general.

So, the way to create types that allow mutation while shared is to ensure that their API for mutation uses & pointers instead of &mut. Generally speaking this should be done by having the type contain pre-written types like Cell, i.e. use them as building blocks.
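As a minimal sketch of that building-block approach, the type below wraps a Cell and exposes mutation through &self. The names HitCounter, record, total and count_via_aliases are hypothetical and chosen only for illustration; they do not come from the question.

```rust
use std::cell::Cell;

/// Hypothetical example type: a counter that can be bumped through `&self`.
struct HitCounter {
    hits: Cell<u32>,
}

impl HitCounter {
    fn new() -> Self {
        HitCounter { hits: Cell::new(0) }
    }

    /// Mutates through a shared reference; the inner `Cell` makes this safe.
    fn record(&self) {
        self.hits.set(self.hits.get() + 1);
    }

    /// Reads the current count, again through a shared reference.
    fn total(&self) -> u32 {
        self.hits.get()
    }
}

/// Bumps one counter through two aliasing shared references.
fn count_via_aliases() -> u32 {
    let counter = HitCounter::new();
    let first = &counter;
    let second = &counter;
    first.record();
    second.record();
    counter.total()
}
```

No unsafe code is needed here: all of the interior mutability is delegated to Cell, so the wrapper inherits its guarantees.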

The reason later use of u2 fails is a longer story...

At a lower level, mutating a value while it is shared (e.g. has multiple & pointers to it) is undefined behaviour, except for when the value is contained in an UnsafeCell. This is the very lowest level of interior mutability, designed to be used as a building block for building other abstractions.

Types that allow safe interior mutability, like Cell, RefCell (for sequential code), the Atomic*s, Mutex and RwLock (for concurrent code) all use UnsafeCell internally and impose some restrictions around it to ensure that it is safe. For example, the definition of Cell is:

pub struct Cell<T> {
    value: UnsafeCell<T>,
}

Cell ensures that mutations are safe by carefully restricting the API it offers: the T: Copy in the code above is key.

(If you wish to write your own low-level type with interior mutability, you just need to ensure that the things that are mutated while being shared are contained in an UnsafeCell. However, I recommend not doing this: Rust has several existing tools (the ones I mentioned above) for interior mutability that are carefully vetted to be safe and correct within Rust's aliasing and mutation rules; breaking the rules is undefined behaviour and can easily result in miscompiled programs.)
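For illustration only, here is a rough sketch of what such a hand-rolled type might look like. MyCell and write_through_alias are hypothetical names, and the sketch assumes the same T: Copy restriction that the Cell API above relies on; it is not the standard library's implementation.

```rust
use std::cell::UnsafeCell;

/// Hypothetical hand-rolled cell, for illustration only.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    /// Copies the value out; no reference to the interior ever escapes.
    fn get(&self) -> T {
        // SAFETY: `UnsafeCell` is `!Sync`, so `MyCell` cannot be shared across
        // threads, and this type never hands out references to the interior.
        unsafe { *self.value.get() }
    }

    /// Overwrites the value through a shared reference.
    fn set(&self, value: T) {
        // SAFETY: same reasoning as `get`; `T: Copy` also rules out `Drop` impls,
        // so overwriting runs no user code.
        unsafe { *self.value.get() = value; }
    }
}

/// Writes through an alias and returns (value before, value after).
fn write_through_alias() -> (i32, i32) {
    let cell = MyCell::new(1);
    let alias = &cell;
    let before = alias.get();
    alias.set(2);
    (before, cell.get())
}
```

Because the field is an UnsafeCell<T>, the compiler infers the same variance for MyCell<T> as it does for Cell<T>; nothing extra has to be declared.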

Anyway, the key that makes the compiler understand that the &u2 is borrowed in the Cell case is variance of lifetimes. Typically, the compiler will shorten lifetimes when you pass things to functions, which makes things work great, e.g. you can pass a string literal (&'static str) to a function expecting &'a str, because the long 'static lifetime is shortened to 'a. This is happening for testa: the testa(&a, &u2) call is shortening the lifetimes of the references from the longest they could possibly be (the whole of the body of main) to just that function call. The compiler is free to do this because normal references are variant [1] in their lifetimes, i.e. it can vary them.
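The string-literal case can be sketched as follows. The functions longer and pick are hypothetical helpers invented for this example.

```rust
/// Returns the longer of two string slices; both must share one lifetime `'a`.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

/// Passes a `&'static str` where a shorter-lived `&'a str` is expected.
fn pick() -> usize {
    let literal: &'static str = "static literal";
    let owned = String::from("short");
    // `'a` is chosen to fit the borrow of `owned`; the `'static` lifetime of
    // `literal` is shortened to match, which is allowed because shared
    // references are covariant in their lifetime.
    let winner = longer(literal, &owned);
    winner.len()
}
```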

However, for testa_mut, the &mut FooRef<'a> stops the compiler being able to shorten that lifetime (in technical terms &mut T is "invariant in T"), exactly because something like testa_mut can happen. In this case, the compiler sees the &mut FooRef<'a> and understands that the 'a lifetime can't be shortened at all, and so in the call testa_mut(&mut a, &u2) it has to take the true lifetime of the u2 value (the whole function) and hence causes u2 to be borrowed for that region.

So, coming back to interior mutability: UnsafeCell<T> not only tells the compiler that a thing may be mutated while aliased (and hence inhibits some optimisations that would otherwise be unsound), it is also invariant in T, i.e. it acts like a &mut T for the purposes of this lifetime/borrowing analysis, exactly because it allows code like testb.

The compiler infers this variance automatically; it becomes invariant when some type parameter/lifetime is contained in UnsafeCell or &mut somewhere in the type (like FooRef in Cell<FooRef<'a>>).
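A small sketch of the difference, using hypothetical function names: the covariant version compiles, while the Cell version is left commented out because it is expected to be rejected with a lifetime error.

```rust
// Covariant case: a shared reference to a long-lived `&str` may be viewed as
// a shared reference to a shorter-lived `&str`.
fn shorten_ref<'short, 'long: 'short>(x: &'short &'long str) -> &'short &'short str {
    x
}

// Invariant case: uncommenting this function should produce a lifetime error,
// because `Cell<&'long str>` cannot be treated as `Cell<&'short str>`.
//
// fn shorten_cell<'short, 'long: 'short>(
//     x: &'short std::cell::Cell<&'long str>,
// ) -> &'short std::cell::Cell<&'short str> {
//     x
// }

/// Exercises the covariant function with a `'static` string.
fn shorten_demo() -> usize {
    let s: &'static str = "hello";
    let shortened = shorten_ref(&s);
    shortened.len()
}
```

If shorten_cell were accepted, a caller could set() a short-lived reference into a cell whose owner still expects a long-lived one, which is the same hazard that testb illustrates.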

The Rustonomicon talks about this and other detailed considerations like it.

[1] Strictly speaking, there are four kinds of variance in type system jargon: bivariance, covariance, contravariance and invariance. In practice Rust mostly has invariance and covariance; contravariance exists but is rare, arising only for function argument types. When I say "variant" it really means "covariant". See the Rustonomicon link above for more detail.
