Why does the Rust borrow checker reject this code?


Problem description


I'm getting a Rust compile error from the borrow checker, and I don't understand why. There's probably something about lifetimes I don't fully understand.

I've boiled it down to a short code sample. In main, I want to do this:

fn main() {
    let codeToScan = "40 + 2";
    let mut scanner = Scanner::new(codeToScan);
    let first_token = scanner.consume_till(|c| { ! c.is_digit ()});
    println!("first token is: {}", first_token);
    // scanner.consume_till(|c| { c.is_whitespace ()}); // WHY DOES THIS LINE FAIL?
}

Trying to call scanner.consume_till a second time gives me this error:

example.rs:64:5: 64:12 error: cannot borrow `scanner` as mutable more than once at a time
example.rs:64     scanner.consume_till(|c| { c.is_whitespace ()}); // WHY DOES THIS LINE FAIL?
                  ^~~~~~~
example.rs:62:23: 62:30 note: previous borrow of `scanner` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `scanner` until the borrow ends
example.rs:62     let first_token = scanner.consume_till(|c| { ! c.is_digit ()});
                                    ^~~~~~~
example.rs:65:2: 65:2 note: previous borrow ends here
example.rs:59 fn main() {
...
example.rs:65 }

Basically, I've made something like my own iterator, and the equivalent to the "next" method takes &mut self. Because of that, I can't use the method more than once in the same scope.

However, the Rust std library has an iterator which can be used more than once in the same scope, and it also takes a &mut self parameter.

let test = "this is a string";
let mut iterator = test.chars();
iterator.next();
iterator.next(); // This is PERFECTLY LEGAL

So why does the Rust std library code compile, but mine doesn't? (I'm sure the lifetime annotations are at the root of it, but my understanding of lifetimes doesn't lead me to expect a problem.)

Here's my full code (only 60 lines, shortened for this question):

use std::str::{Chars};
use std::iter::{Enumerate};

#[deriving(Show)]
struct ConsumeResult<'lt> {
    value: &'lt str,
    startIndex: uint,
    endIndex: uint,
}

struct Scanner<'lt> {
    code: &'lt str,
    char_iterator: Enumerate<Chars<'lt>>,
    isEof: bool,
}

impl<'lt> Scanner<'lt> {
    fn new<'lt>(code: &'lt str) -> Scanner<'lt> {
        Scanner{code: code, char_iterator: code.chars().enumerate(), isEof: false}
    }

    fn assert_not_eof<'lt>(&'lt self) {
        if self.isEof {fail!("Scanner is at EOF."); }
    }

    fn next(&mut self) -> Option<(uint, char)> {
        self.assert_not_eof();
        let result = self.char_iterator.next();
        if result == None { self.isEof = true; }
        return result;
    }

    fn consume_till<'lt>(&'lt mut self, quit: |char| -> bool) -> ConsumeResult<'lt> {
        self.assert_not_eof();
        let mut startIndex: Option<uint> = None;
        let mut endIndex: Option<uint> = None;

        loop {
            let should_quit = match self.next() {
                None => {
                    endIndex = Some(endIndex.unwrap() + 1);
                    true
                },
                Some((i, ch)) => {
                    if startIndex == None { startIndex = Some(i);}
                    endIndex = Some(i);
                    quit (ch)
                }
            };

            if should_quit {
                return ConsumeResult{ value: self.code.slice(startIndex.unwrap(), endIndex.unwrap()),
                                      startIndex:startIndex.unwrap(), endIndex: endIndex.unwrap() };
            }
        }
    }
}

fn main() {
    let codeToScan = "40 + 2";
    let mut scanner = Scanner::new(codeToScan);
    let first_token = scanner.consume_till(|c| { ! c.is_digit ()});
    println!("first token is: {}", first_token);
    // scanner.consume_till(|c| { c.is_whitespace ()}); // WHY DOES THIS LINE FAIL?
}

Solution

Here's a simpler example of the same thing:

struct Scanner<'a> {
    s: &'a str
}

impl<'a> Scanner<'a> {
    fn step_by_3_bytes<'a>(&'a mut self) -> &'a str {
        let return_value = self.s.slice_to(3);
        self.s = self.s.slice_from(3);
        return_value
    }
}

fn main() {
    let mut scan = Scanner { s: "123456" };

    let a = scan.step_by_3_bytes();
    println!("{}", a);

    let b = scan.step_by_3_bytes();
    println!("{}", b);
}

If you compile that, you get errors like the code in the question:

<anon>:19:13: 19:17 error: cannot borrow `scan` as mutable more than once at a time
<anon>:19     let b = scan.step_by_3_bytes();
                      ^~~~
<anon>:16:13: 16:17 note: previous borrow of `scan` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `scan` until the borrow ends
<anon>:16     let a = scan.step_by_3_bytes();
                      ^~~~
<anon>:21:2: 21:2 note: previous borrow ends here
<anon>:13 fn main() {
...
<anon>:21 }
          ^

Now, the first thing to do is to avoid shadowing lifetimes: this code has two lifetimes called 'a, and every 'a in step_by_3_bytes refers to the 'a declared on that method; none of them actually refers to the 'a in Scanner<'a>. I'll rename the inner one to make it crystal clear what is going on:

impl<'a> Scanner<'a> {
    fn step_by_3_bytes<'b>(&'b mut self) -> &'b str {

The problem here is that 'b connects the borrow of the self object with the str return value. When type checking a caller, the compiler looks only at the signature of step_by_3_bytes, never at its body (type checking is based purely on the type signatures of the things that are called, with no introspection). From the outside, it therefore has to assume that calling step_by_3_bytes can make arbitrary modifications, including invalidating previous return values. That is, it could be defined like

struct Scanner<'a> {
    s: &'a str,
    other: String,
    count: uint
}

impl<'a> Scanner<'a> {
    fn step_by_3_bytes<'b>(&'b mut self) -> &'b str {
        self.other.push_str(self.s);
        // return a reference into data we own
        self.other.as_slice()
    }
}

Now, each call to step_by_3_bytes starts modifying the object that previous return values came from. E.g. it could cause the String to reallocate and thus move in memory, leaving any other &str return values as dangling pointers. Rust protects against this by tracking these references and disallowing mutation if it could cause such catastrophic events. Going back to our actual code: the compiler is type checking main just by looking at the type signature of step_by_3_bytes/consume_till and so it can only assume the worst case scenario (i.e. the example I just gave).
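To make the worst case concrete, here is a minimal sketch of that hypothetical definition in current Rust syntax (as_str instead of as_slice; the count field is dropped because it is unused). The error output shown earlier comes from a 2014 compiler; current compilers end a borrow at its last use, so the caller in this sketch sidesteps the conflict by copying the first result into an owned String before calling the method again. That to_owned call is an illustrative workaround, not part of the original answer.

```rust
struct Scanner<'a> {
    s: &'a str,
    other: String,
}

impl<'a> Scanner<'a> {
    // The result borrows from `self.other`, so it really must be tied to
    // the borrow of `self` ('b): a later call may reallocate `other`.
    fn step_by_3_bytes<'b>(&'b mut self) -> &'b str {
        self.other.push_str(self.s);
        // return a reference into data we own
        self.other.as_str()
    }
}

fn main() {
    let mut scan = Scanner { s: "123", other: String::new() };

    // Copy the first result out so the mutable borrow of `scan` ends here.
    let a = scan.step_by_3_bytes().to_owned();
    let b = scan.step_by_3_bytes();
    println!("{} {}", a, b);
}
```

Keeping the first result as a plain &str and using it after the second call would be rejected, which is exactly the protection described above.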


How do we solve this?

Let's take a step back, as if we're just starting out and don't know which lifetimes we want for the return values, so we'll just leave them as '_ placeholders (this was not valid Rust when it was written; here '_ only means "not decided yet"):

impl<'a> Scanner<'a> {
    fn step_by_3_bytes<'b>(&'_ mut self) -> &'_ str {

Now, we get to ask the fun question: which lifetimes do we want where?

It's almost always best to annotate the longest valid lifetimes, and we know our return value lives for 'a (since it comes straight out of the s field, and that &str is valid for 'a). That is,

impl<'a> Scanner<'a> {
    fn step_by_3_bytes<'b>(&'_ mut self) -> &'a str {

For the other '_, we don't actually care: as API designers, we don't have any particular desire or need to connect the self borrow with any other references (unlike the return value, where we wanted/needed to express which memory it came from). So, we might as well leave it off

impl<'a> Scanner<'a> {
    fn step_by_3_bytes<'b>(&mut self) -> &'a str {

The 'b is unused, so it can be killed, leaving us with

impl<'a> Scanner<'a> {
    fn step_by_3_bytes(&mut self) -> &'a str {

This expresses that Scanner is referring to some memory that is valid for at least 'a, and then returning references into just that memory. The self object is essentially just a proxy for manipulating those views: once you have the reference it returns, you can discard the Scanner (or call more methods).

In summary, the full, working code is

struct Scanner<'a> {
    s: &'a str
}

impl<'a> Scanner<'a> {
    fn step_by_3_bytes(&mut self) -> &'a str {
        let return_value = self.s.slice_to(3);
        self.s = self.s.slice_from(3);
        return_value
    }
}

fn main() {
    let mut scan = Scanner { s: "123456" };

    let a = scan.step_by_3_bytes();
    println!("{}", a);

    let b = scan.step_by_3_bytes();
    println!("{}", b);
}
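The slice_to and slice_from methods used above come from pre-1.0 Rust and no longer exist. As a sketch of the same idea for a current compiler, the version below uses str::split_at instead; that substitution, and the first_three helper, are my own additions for illustration. The helper shows the "proxy" point: the Scanner can be dropped while the returned slice stays valid.

```rust
struct Scanner<'a> {
    s: &'a str,
}

impl<'a> Scanner<'a> {
    // The result is tied to 'a (the underlying string), not to the
    // short borrow of `self`.
    fn step_by_3_bytes(&mut self) -> &'a str {
        // Assumption: at least 3 bytes remain and byte 3 is a char
        // boundary; split_at panics otherwise.
        let (head, tail) = self.s.split_at(3);
        self.s = tail;
        head
    }
}

// Hypothetical helper: the Scanner is dropped at the end of this function,
// yet the returned slice is still valid because it borrows from `text`.
fn first_three(text: &str) -> &str {
    let mut scan = Scanner { s: text };
    scan.step_by_3_bytes()
}

fn main() {
    let mut scan = Scanner { s: "123456" };

    let a = scan.step_by_3_bytes();
    let b = scan.step_by_3_bytes(); // a second mutable borrow is fine now
    println!("{} {}", a, b);

    println!("{}", first_three("abcdef"));
}
```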

Applying this change to your code is simply adjusting the definition of consume_till.

fn consume_till(&mut self, quit: |char| -> bool) -> ConsumeResult<'lt> {
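For readers on a current compiler, here is a simplified sketch of the question's scanner with that signature applied. Several things are assumptions on my part and not the asker's code: the old |char| -> bool closure type is written as impl FnMut(char) -> bool, uint becomes usize, #[deriving(Show)] becomes #[derive(Debug)], char_indices replaces chars().enumerate() so the indices are byte offsets usable for slicing, and the EOF flag is left out for brevity.

```rust
use std::str::CharIndices;

#[derive(Debug)]
struct ConsumeResult<'lt> {
    value: &'lt str,
    start_index: usize,
    end_index: usize,
}

struct Scanner<'lt> {
    code: &'lt str,
    char_iterator: CharIndices<'lt>,
}

impl<'lt> Scanner<'lt> {
    fn new(code: &'lt str) -> Scanner<'lt> {
        Scanner { code, char_iterator: code.char_indices() }
    }

    // `&mut self` gets its own short, elided lifetime; only the result
    // carries 'lt, so the scanner is free again once the call returns.
    fn consume_till(&mut self, mut quit: impl FnMut(char) -> bool) -> ConsumeResult<'lt> {
        let code: &'lt str = self.code;
        // Byte offset of the next unread character.
        let start_index = code.len() - self.char_iterator.as_str().len();
        // If `quit` never fires, the token runs to the end of the input.
        let mut end_index = code.len();
        for (i, ch) in self.char_iterator.by_ref() {
            if quit(ch) {
                // The character that triggered `quit` is consumed but
                // excluded from the returned slice.
                end_index = i;
                break;
            }
        }
        ConsumeResult { value: &code[start_index..end_index], start_index, end_index }
    }
}

fn main() {
    let mut scanner = Scanner::new("40 + 2");
    let first_token = scanner.consume_till(|c| !c.is_ascii_digit());
    let second_token = scanner.consume_till(|c| c.is_whitespace());
    println!("first token is: {:?}", first_token);
    println!("second token is: {:?}", second_token);
}
```

Both tokens can be held at the same time because neither one keeps the scanner borrowed.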


Quoting the question: "So why does the Rust std library code compile, but mine doesn't? (I'm sure the lifetime annotations are at the root of it, but my understanding of lifetimes doesn't lead to me expecting a problem)."

There's a slight (but not huge) difference here: Chars is just returning a char, i.e. no lifetimes in the return value. The next method (essentially) has signature:

impl<'a> Chars<'a> {
    fn next(&mut self) -> Option<char> {

(It's actually in an Iterator trait impl, but that's not important.)

The situation you have here is similar to writing

impl<'a> Chars<'a> {
    fn next(&'a mut self) -> Option<char> {

(Similar in terms of "incorrect linking of lifetimes", the details differ.)
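As a small illustration with the real standard library type, the sketch below calls next twice and then uses Chars::as_str, which takes &self and returns a slice of the original string. The helper name first_two_and_rest is made up for this example. Returning the remaining slice after the local iterator is gone only works because that slice borrows from the input text and not from the iterator, which is the same decoupling as in the fixed step_by_3_bytes.

```rust
// Hypothetical helper: consume two chars, then hand back what is left.
fn first_two_and_rest(text: &str) -> (Option<char>, Option<char>, &str) {
    let mut iterator = text.chars();
    let first = iterator.next();
    let second = iterator.next(); // calling next again is perfectly legal
    // `rest` borrows from `text`, so it outlives the local iterator.
    let rest = iterator.as_str();
    (first, second, rest)
}

fn main() {
    let (first, second, rest) = first_two_and_rest("this is a string");
    println!("{:?} {:?} {:?}", first, second, rest);
}
```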
