How does the Rust compiler know `Cell` has internal mutability?
Question
Consider the following code (Playground version):
    use std::cell::Cell;

    struct Foo(u32);

    #[derive(Clone, Copy)]
    struct FooRef<'a>(&'a Foo);

    // the body of these functions don't matter
    fn testa<'a>(x: &FooRef<'a>, y: &'a Foo) { x; }
    fn testa_mut<'a>(x: &mut FooRef<'a>, y: &'a Foo) { *x = FooRef(y); }
    fn testb<'a>(x: &Cell<FooRef<'a>>, y: &'a Foo) { x.set(FooRef(y)); }

    fn main() {
        let u1 = Foo(3);
        let u2 = Foo(5);
        let mut a = FooRef(&u1);
        let b = Cell::new(FooRef(&u1));

        // try one of the following 3 statements
        testa(&a, &u2);         // allow move at (1)
        testa_mut(&mut a, &u2); // deny move -- fine!
        testb(&b, &u2);         // deny move -- but how does rustc know?

        u2;                     // (1) move out
        // ... do something with a or b
    }
I'm curious how `rustc` knows that `Cell` has interior mutability and may hold on to a reference to the other argument.
If I create another data structure from scratch, similar to `Cell`, that also has interior mutability, how do I tell `rustc` that?
Recommended answer
The reason the code with `Cell` compiles (ignoring the `u2`) and mutates is that `Cell`'s whole API takes `&` pointers:
    impl<T> Cell<T> where T: Copy {
        fn new(value: T) -> Cell<T> { ... }
        fn get(&self) -> T { ... }
        fn set(&self, value: T) { ... }
    }
It is carefully written to allow mutation while shared, i.e. interior mutability. This allows it to expose these mutating methods behind a `&` pointer. Conventional mutation requires a `&mut` pointer (with its associated non-aliasing restrictions) because, in general, having unique access to a value is the only way to ensure that mutating it will be safe.
So, the way to create types that allow mutation while shared is to ensure that their API for mutation uses `&` pointers instead of `&mut`. Generally speaking, this should be done by having the type contain pre-written types like `Cell`, i.e. by using them as building blocks.
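As a minimal sketch of the building-block approach, the hypothetical `HitCounter` type below (the name and its methods are invented for illustration, not taken from the question) wraps a `Cell<u32>` so that its mutating method can take `&self`:

```rust
use std::cell::Cell;

// Hypothetical type for illustration: a counter that can be bumped
// through a shared reference.
struct HitCounter {
    hits: Cell<u32>,
}

impl HitCounter {
    fn new() -> Self {
        HitCounter { hits: Cell::new(0) }
    }

    // Takes `&self`, not `&mut self`; the mutation is delegated to `Cell`.
    fn record(&self) {
        self.hits.set(self.hits.get() + 1);
    }

    fn hits(&self) -> u32 {
        self.hits.get()
    }
}

fn main() {
    let counter = HitCounter::new();
    // Two shared references coexist, and either one can trigger mutation.
    let first = &counter;
    let second = &counter;
    first.record();
    second.record();
    assert_eq!(counter.hits(), 2);
}
```

No `unsafe` is needed here, because `Cell` already encapsulates the unsafe part.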
The reason later use of `u2` fails is a longer story...
At a lower level, mutating a value while it is shared (e.g. while multiple `&` pointers to it exist) is undefined behaviour, except when the value is contained in an `UnsafeCell`. This is the very lowest level of interior mutability, designed to be used as a building block for other abstractions.
Types that allow safe interior mutability, like `Cell` and `RefCell` (for sequential code) and the `Atomic*` types, `Mutex` and `RwLock` (for concurrent code), all use `UnsafeCell` internally and impose some restrictions around it to ensure that it is safe. For example, the definition of `Cell` is:
    pub struct Cell<T> {
        value: UnsafeCell<T>,
    }
`Cell` ensures that mutations are safe by carefully restricting the API it offers: the `T: Copy` in the code above is key.
(If you wish to write your own low-level type with interior mutability, you just need to ensure that the things that are mutated while being shared are contained in an `UnsafeCell`. However, I recommend not doing this: Rust has several existing tools for interior mutability (the ones I mentioned above) that are carefully vetted to be safe and correct within Rust's aliasing and mutation rules; breaking the rules is undefined behaviour and can easily result in miscompiled programs.)
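Purely to illustrate what "contained in an `UnsafeCell`" means, here is a rough sketch of a `Cell`-like type. `MyCell` is a hypothetical name, the sketch assumes `T: Copy` throughout, and it is not a substitute for the standard library types:

```rust
use std::cell::UnsafeCell;

// Hypothetical `Cell`-like type, for illustration only.
struct MyCell<T> {
    // The value that may be mutated while shared lives inside `UnsafeCell`.
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: `UnsafeCell` is not `Sync`, so `MyCell` cannot be shared
        // across threads, and no reference to the interior is ever handed
        // out; the value is only copied out.
        unsafe { *self.value.get() }
    }

    fn set(&self, value: T) {
        // SAFETY: same reasoning as in `get`; nothing can be observing the
        // interior while it is overwritten.
        unsafe { *self.value.get() = value; }
    }
}

fn main() {
    let cell = MyCell::new(1u32);
    let shared = &cell;
    shared.set(5);
    assert_eq!(cell.get(), 5);
}
```

The safety argument rests on never exposing a reference into the cell; an API that returned `&T` or `&mut T` from `&self` would need a different justification.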
Anyway, the key that makes the compiler understand that `&u2` is borrowed in the `Cell` case is variance of lifetimes. Typically, the compiler will shorten lifetimes when you pass things to functions, which makes things work well: for example, you can pass a string literal (`&'static str`) to a function expecting `&'a str`, because the long `'static` lifetime is shortened to `'a`. This is what happens for `testa`: the `testa(&a, &u2)` call shortens the lifetimes of the references from the longest they could possibly be (the whole body of `main`) to just that function call. The compiler is free to do this because normal references are variant[1] in their lifetimes, i.e. it can vary them.
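The lifetime shortening described above can be sketched as follows. The `longer` function is a hypothetical helper invented for this example; it requires both arguments to share one lifetime `'a`:

```rust
// Hypothetical helper: both arguments must share a single lifetime `'a`.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if y.len() > x.len() { y } else { x }
}

fn main() {
    let literal: &'static str = "a string literal";
    let result_len;
    {
        let local = String::from("short-lived");
        // `literal` is `&'static str`, yet it is accepted where `&'a str`
        // is expected: the compiler shortens `'static` to a lifetime that
        // the borrow of `local` can also satisfy.
        let result = longer(literal, &local);
        result_len = result.len();
    }
    assert_eq!(result_len, 16);
}
```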
However, for `testa_mut`, the `&mut FooRef<'a>` stops the compiler from shortening that lifetime (in technical terms, `&mut T` is "invariant in `T`"), exactly because something like `testa_mut` can happen. In this case, the compiler sees the `&mut FooRef<'a>` and understands that the `'a` lifetime can't be shortened at all, so in the call `testa_mut(&mut a, &u2)` it has to use the true lifetime of the `u2` value (the whole function), which causes `u2` to be borrowed for that region.
So, coming back to interior mutability: `UnsafeCell<T>` not only tells the compiler that a thing may be mutated while aliased (and hence inhibits some optimisations that would otherwise be invalid), it is also invariant in `T`, i.e. it acts like a `&mut T` for the purposes of this lifetime/borrowing analysis, exactly because it allows code like `testb`.
The compiler infers this variance automatically; a type becomes invariant in a type parameter or lifetime when that parameter is contained in an `UnsafeCell` or `&mut` somewhere in the type (like `FooRef` in `Cell<FooRef<'a>>`).
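One way to observe the inferred variance is to ask the compiler to shorten a lifetime explicitly. In the sketch below all function names are hypothetical; the commented-out function is the one that invariance is expected to reject:

```rust
use std::cell::Cell;

// Covariant: a reference with a long lifetime can be used as one with a
// shorter lifetime.
fn shorten_ref<'short, 'long: 'short>(r: &'long u32) -> &'short u32 {
    r
}

// Invariant: uncommenting this function is expected to produce a lifetime
// error, because `Cell<&'long u32>` is not a subtype of `Cell<&'short u32>`.
// fn shorten_cell<'short, 'long: 'short>(c: Cell<&'long u32>) -> Cell<&'short u32> {
//     c
// }

// Leaving the lifetime unchanged is always fine.
fn same_cell<'a>(c: Cell<&'a u32>) -> Cell<&'a u32> {
    c
}

fn main() {
    let n = 7;
    assert_eq!(*shorten_ref(&n), 7);
    assert_eq!(*same_cell(Cell::new(&n)).get(), 7);
}
```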
The Rustonomicon talks about this and other detailed considerations like it.
[1] Strictly speaking, there are four kinds of variance in type system jargon: bivariance, covariance, contravariance and invariance. I believe Rust really only has invariance and covariance (there is some contravariance, but it caused problems and is removed/in the process of being removed). When I say "variant" it really means "covariant". See the Rustonomicon, mentioned above, for more detail.