What are Rust's exact auto-dereferencing rules?
Problem description
I'm learning/experimenting with Rust, and in all the elegance that I find in this language, there is one peculiarity that baffles me and seems totally out of place.
Rust automatically dereferences pointers when making method calls. I made some tests to determine the exact behaviour:
struct X { val: i32 }
impl std::ops::Deref for X {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

trait M { fn m(self); }
impl M for i32   { fn m(self) { println!("i32::m()");  } }
impl M for X     { fn m(self) { println!("X::m()");    } }
impl M for &X    { fn m(self) { println!("&X::m()");   } }
impl M for &&X   { fn m(self) { println!("&&X::m()");  } }
impl M for &&&X  { fn m(self) { println!("&&&X::m()"); } }

trait RefM { fn refm(&self); }
impl RefM for i32  { fn refm(&self) { println!("i32::refm()");  } }
impl RefM for X    { fn refm(&self) { println!("X::refm()");    } }
impl RefM for &X   { fn refm(&self) { println!("&X::refm()");   } }
impl RefM for &&X  { fn refm(&self) { println!("&&X::refm()");  } }
impl RefM for &&&X { fn refm(&self) { println!("&&&X::refm()"); } }

struct Y { val: i32 }
impl std::ops::Deref for Y {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

struct Z { val: Y }
impl std::ops::Deref for Z {
    type Target = Y;
    fn deref(&self) -> &Y { &self.val }
}

#[derive(Clone, Copy)]
struct A;
impl M for A       { fn m(self) { println!("A::m()");    } }
impl M for &&&A    { fn m(self) { println!("&&&A::m()"); } }
impl RefM for A    { fn refm(&self) { println!("A::refm()");    } }
impl RefM for &&&A { fn refm(&self) { println!("&&&A::refm()"); } }

fn main() {
    // I'll use @ to denote left side of the dot operator
    (*X{val:42}).m();        // i32::m()    , Self == @
    X{val:42}.m();           // X::m()      , Self == @
    (&X{val:42}).m();        // &X::m()     , Self == @
    (&&X{val:42}).m();       // &&X::m()    , Self == @
    (&&&X{val:42}).m();      // &&&X::m()   , Self == @
    (&&&&X{val:42}).m();     // &&&X::m()   , Self == *@
    (&&&&&X{val:42}).m();    // &&&X::m()   , Self == **@
    println!("-------------------------");
    (*X{val:42}).refm();     // i32::refm() , Self == @
    X{val:42}.refm();        // X::refm()   , Self == @
    (&X{val:42}).refm();     // X::refm()   , Self == *@
    (&&X{val:42}).refm();    // &X::refm()  , Self == *@
    (&&&X{val:42}).refm();   // &&X::refm() , Self == *@
    (&&&&X{val:42}).refm();  // &&&X::refm(), Self == *@
    (&&&&&X{val:42}).refm(); // &&&X::refm(), Self == **@
    println!("-------------------------");
    Y{val:42}.refm();        // i32::refm() , Self == *@
    Z{val:Y{val:42}}.refm(); // i32::refm() , Self == **@
    println!("-------------------------");
    A.m();                   // A::m()      , Self == @
    // without the Copy trait, (&A).m() would be a compilation error:
    // cannot move out of borrowed content
    (&A).m();                // A::m()      , Self == *@
    (&&A).m();               // &&&A::m()   , Self == &@
    (&&&A).m();              // &&&A::m()   , Self == @
    A.refm();                // A::refm()   , Self == @
    (&A).refm();             // A::refm()   , Self == *@
    (&&A).refm();            // A::refm()   , Self == **@
    (&&&A).refm();           // &&&A::refm(), Self == @
}
So, it seems that, more or less:
- The compiler will insert as many dereference operators as necessary to invoke a method.
- The compiler, when resolving methods declared using `&self` (call-by-reference):
  - First tries calling for a single dereference of `self`
  - Then tries calling for the exact type of `self`
  - Then, tries inserting as many dereference operators as necessary for a match
- Methods declared using `self` (call-by-value) for type `T` behave as if they were declared using `&self` (call-by-reference) for type `&T` and called on the reference to whatever is on the left side of the dot operator.
- The above rules are first tried with raw built-in dereferencing, and if there's no match, the overload with `Deref` trait is used.
What are the exact auto-dereferencing rules? Can anyone give any formal rationale for such a design decision?
Your pseudo-code is pretty much correct. For this example, suppose we had a method call `foo.bar()` where `foo: T`. I'm going to use the fully qualified syntax (FQS) to be unambiguous about what type the method is being called with, e.g. `A::bar(foo)` or `A::bar(&***foo)`. I'm just going to write a pile of random capital letters, each one is just some arbitrary type/trait, except `T` is always the type of the original variable `foo` that the method is called on.
The core of the algorithm is:
- For each "dereference step" `U` (that is, set `U = T` and then `U = *T`, ...)
  1. if there's a method `bar` where the receiver type (the type of `self` in the method) matches `U` exactly, use it (a "by value method")
  2. otherwise, add one auto-ref (take `&` or `&mut` of the receiver), and, if some method's receiver matches `&U`, use it (an "autorefd method")
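As a rough illustration of that loop, here is a minimal sketch that spells out the candidate receiver types for one call. `Inner`, `Wrapper` and `Describe` are hypothetical names invented for this example; they are not from the question or from the compiler.

```rust
use std::ops::Deref;

// Hypothetical types, used only for this illustration.
struct Inner;
struct Wrapper(Inner);

impl Deref for Wrapper {
    type Target = Inner;
    fn deref(&self) -> &Inner { &self.0 }
}

trait Describe {
    // The receiver type of this method is `&Self`.
    fn describe(&self) -> &'static str;
}

impl Describe for Inner {
    fn describe(&self) -> &'static str { "Inner" }
}

fn via_method_syntax(w: &Wrapper) -> &'static str {
    // T = &Wrapper. Candidate receiver types, in the order they are tried:
    //   U = &Wrapper : by value `&Wrapper`, auto-ref `&&Wrapper` -> no match
    //   U = Wrapper  : by value `Wrapper`,  auto-ref `&Wrapper`  -> no match
    //   U = Inner    : by value `Inner` -> no match; auto-ref `&Inner` -> match
    w.describe()
}

fn via_fqs(w: &Wrapper) -> &'static str {
    // The same call written out: one built-in deref, one `Deref::deref`,
    // then a single auto-ref.
    Describe::describe(&**w)
}
```

Both functions should return `"Inner"` when given `&Wrapper(Inner)`, since the method-call form is expected to resolve to the fully qualified one.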
Notably, everything considers the "receiver type" of the method, not the `Self` type of the trait, i.e. `impl ... for Foo { fn method(&self) {} }` thinks about `&Foo` when matching the method, and `fn method2(&mut self)` would think about `&mut Foo` when matching.
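To make the receiver-type point concrete, here is a small sketch with one method per receiver kind. `Tally` and `Counter` are made-up names for this example only.

```rust
// Hypothetical trait and type, named only for this illustration.
trait Tally {
    fn peek(&self) -> i32;  // receiver type: &Self
    fn bump(&mut self);     // receiver type: &mut Self
    fn finish(self) -> i32; // receiver type: Self
}

struct Counter(i32);

impl Tally for Counter {
    fn peek(&self) -> i32 { self.0 }
    fn bump(&mut self) { self.0 += 1; }
    fn finish(self) -> i32 { self.0 }
}

fn receivers() -> (i32, i32) {
    let mut c = Counter(1);
    // `c` has type `Counter`; lookup compares against receiver types.
    c.bump();               // matches `&mut Counter` after auto-ref: Tally::bump(&mut c)
    let seen = c.peek();    // matches `&Counter` after auto-ref: Tally::peek(&c)
    Tally::bump(&mut c);    // the fully qualified form, written by hand
    let total = c.finish(); // matches `Counter` by value: Tally::finish(c), moves `c`
    (seen, total)
}
```

All three methods live in the same `impl Tally for Counter`, yet each is matched against a different type (`&Counter`, `&mut Counter`, `Counter`).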
It is an error if there are ever multiple trait methods valid in the inner steps (that is, there can only be zero or one trait methods valid in each of 1. or 2., but there can be one valid for each: the one from 1 will be taken first), and inherent methods take precedence over trait ones. It's also an error if we get to the end of the loop without finding anything that matches. It is also an error to have recursive `Deref` implementations, which make the loop infinite (they'll hit the "recursion limit").
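The precedence of inherent methods can be checked with a short sketch. `Widget` and `Named` are hypothetical names; the two methods deliberately share a name and receiver type.

```rust
// Hypothetical type and trait for this illustration.
struct Widget;

trait Named {
    fn name(&self) -> &'static str;
}

impl Named for Widget {
    fn name(&self) -> &'static str { "trait" }
}

impl Widget {
    // Inherent method with the same name and receiver type as the trait method.
    fn name(&self) -> &'static str { "inherent" }
}

fn method_syntax() -> &'static str {
    // Both candidates match after one auto-ref; the inherent one is chosen.
    Widget.name()
}

fn fqs_trait() -> &'static str {
    // Fully qualified syntax still reaches the trait method.
    <Widget as Named>::name(&Widget)
}
```

This is one of the edge cases where the FQS form is the only way to name the method you want.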
These rules seem to do-what-I-mean in most circumstances, although having the ability to write the unambiguous FQS form is very useful in some edge cases, and for sensible error messages for macro-generated code.
Only one auto-reference is added because
- if there was no bound, things get bad/slow, since every type can have an arbitrary number of references taken
- taking one reference `&foo` retains a strong connection to `foo` (it is the address of `foo` itself), but taking more starts to lose it: `&&foo` is the address of some temporary variable on the stack that stores `&foo`.
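A small sketch of the single auto-ref limit, assuming a hypothetical trait `Tag` implemented only for `&&Thing` (both names are invented here):

```rust
// Hypothetical type and trait for this illustration.
struct Thing;

trait Tag {
    fn tag(self) -> &'static str; // receiver type: Self
}

// Implemented only for a doubly referenced `Thing`.
impl Tag for &&Thing {
    fn tag(self) -> &'static str { "&&Thing" }
}

fn one_autoref() -> (&'static str, &'static str) {
    let t = Thing;
    let r = &t;
    // `t.tag()` is expected to be rejected: from `Thing` the only candidates
    // are `Thing`, `&Thing` and `&mut Thing`, and reaching `&&Thing` would
    // need two auto-refs.
    //
    // From `r: &Thing`, one auto-ref is enough to reach `&&Thing`.
    (r.tag(), Tag::tag(&r))
}
```

Both elements of the returned tuple come from the same `impl`, the second one being the fully qualified spelling of the first.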
Examples
Suppose we have a call `foo.refm()`, if `foo` has type:
- `X`, then we start with `U = X`, `refm` has receiver type `&...`, so step 1 doesn't match, taking an auto-ref gives us `&X`, and this does match (with `Self = X`), so the call is `RefM::refm(&foo)`
- `&X`, starts with `U = &X`, which matches `&self` in the first step (with `Self = X`), and so the call is `RefM::refm(foo)`
- `&&&&&X`, this doesn't match either step (the trait isn't implemented for `&&&&X` or `&&&&&X`), so we dereference once to get `U = &&&&X`, which matches 1 (with `Self = &&&X`) and the call is `RefM::refm(*foo)`
- `Z`, doesn't match either step so it is dereferenced once, to get `Y`, which also doesn't match, so it's dereferenced again, to get `i32` (`Y` dereferences to `i32`, not `X`), which doesn't match 1, but does match after autorefing, so the call is `RefM::refm(&**foo)`
- `&&A`, the 1. doesn't match and neither does 2. since the trait is not implemented for `&A` (for 1) or `&&A` (for 2), so it is dereferenced to `&A`, which matches 1., with `Self = A`, so the call is `RefM::refm(*foo)`
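These resolutions can be checked by pairing each method call with its fully qualified form. The sketch below reuses the question's `X`, `Y`, `Z` and `RefM`, with two assumptions of mine: `refm` returns a string instead of printing, and only the impls needed here are kept.

```rust
use std::ops::Deref;

struct X { val: i32 }
impl Deref for X { type Target = i32; fn deref(&self) -> &i32 { &self.val } }

struct Y { val: i32 }
impl Deref for Y { type Target = i32; fn deref(&self) -> &i32 { &self.val } }

struct Z { val: Y }
impl Deref for Z { type Target = Y; fn deref(&self) -> &Y { &self.val } }

// Assumption: returns the label instead of printing it, so it can be compared.
trait RefM { fn refm(&self) -> &'static str; }
impl RefM for i32  { fn refm(&self) -> &'static str { "i32::refm()" } }
impl RefM for X    { fn refm(&self) -> &'static str { "X::refm()" } }
impl RefM for &&&X { fn refm(&self) -> &'static str { "&&&X::refm()" } }

// Each pair is (method-call syntax, fully qualified syntax).
fn pairs() -> Vec<(&'static str, &'static str)> {
    let x = X { val: 42 };
    let rx = &x;
    let r5 = &&&&&X { val: 42 };
    let z = Z { val: Y { val: 42 } };
    vec![
        (x.refm(),  RefM::refm(&x)),   // foo: X, one auto-ref
        (rx.refm(), RefM::refm(rx)),   // foo: &X, by-value match
        (r5.refm(), RefM::refm(*r5)),  // foo: &&&&&X, one deref
        (z.refm(),  RefM::refm(&**z)), // foo: Z, two derefs then auto-ref
    ]
}
```

Every pair should hold the same string twice; the last one is `"i32::refm()"`, because `**z` has type `i32`.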
Suppose we have `foo.m()`, and that `A` isn't `Copy`, if `foo` has type:
- `A`, then `U = A` matches `self` directly so the call is `M::m(foo)` with `Self = A`
- `&A`, then 1. doesn't match, and neither does 2. (neither `&A` nor `&&A` implement the trait), so it is dereferenced to `A`, which does match, but `M::m(*foo)` requires taking `A` by value and hence moving out of `foo`, hence the error.
- `&&A`, 1. doesn't match, but autorefing gives `&&&A`, which does match, so the call is `M::m(&foo)` with `Self = &&&A`.
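The by-value cases can be sketched the same way. This assumes a non-`Copy` `A` and an `m` that returns its label instead of printing it; both changes are mine, made so the results can be compared.

```rust
struct A; // deliberately not Copy

// Assumption: returns the label instead of printing it.
trait M { fn m(self) -> &'static str; }
impl M for A    { fn m(self) -> &'static str { "A::m()" } }
impl M for &&&A { fn m(self) -> &'static str { "&&&A::m()" } }

fn by_value() -> &'static str {
    let foo = A;
    foo.m() // U = A matches directly: M::m(foo)
}

fn double_ref() -> (&'static str, &'static str) {
    let a = A;
    let foo = &&a;
    // U = &&A has no impl; one auto-ref gives &&&A, which does.
    (foo.m(), M::m(&foo))
}

// With `let foo = &a;`, `foo.m()` resolves to M::m(*foo) and is rejected,
// because it would move a non-Copy `A` out from behind a reference.
```

The `&A` case is left as a comment because it is expected not to compile.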
(This answer is based on the code, and is reasonably close to the (slightly outdated) README. Niko Matsakis, the main author of this part of the compiler/language, also glanced over this answer.)