When is it necessary to circumvent Rust's borrow checker?

Question

I'm implementing Conway's Game of Life to teach myself Rust. The idea is to implement a single-threaded version first, optimize it as much as possible, then do the same for a multi-threaded version.

I wanted to implement an alternative data layout which I thought might be more cache-friendly. The idea is to store the status of two cells for each point on a board next to each other in memory in a vector, one cell for reading the current generation's status from and one for writing the next generation's status to, alternating the access pattern for each generation's computation (which can be determined at compile time).

The basic data structures are as follows:

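// Clone/Copy/PartialEq are needed for `*read` and `status == CellStatus::ALIVE` below.
#[derive(Clone, Copy, PartialEq)]
#[repr(u8)]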
pub enum CellStatus {
    DEAD,
    ALIVE,
}

/** 2 bytes */
pub struct CellRW(CellStatus, CellStatus);

pub struct TupleBoard {
    width: usize,
    height: usize,
    cells: Vec<CellRW>,
}

/** used to keep track of current pos with iterator e.g. */
pub struct BoardPos {
    x_pos: usize,
    y_pos: usize,
    offset: usize,
}

pub struct BoardEvo {
    board: TupleBoard,
}

The function that gives me trouble:

impl BoardEvo {
    fn evolve_step<T: RWSelector>(&mut self) {
        for (pos, cell) in self.board.iter_mut() {
            //pos: BoardPos, cell: &mut CellRW
            let read: &CellStatus = T::read(cell); //chooses the right tuple half for the current evolution step
            let write: &mut CellStatus = T::write(cell);

            let alive_count = pos.neighbours::<T>(&self.board).iter() //<- can't borrow self.board again!
                    .filter(|&&status| status == CellStatus::ALIVE)
                    .count();

            *write = CellStatus::evolve(*read, alive_count);
        }
    }
}

impl BoardPos {
    /* ... */
    pub fn neighbours<T: RWSelector>(&self, board: &TupleBoard) -> [CellStatus; 8] {
        /* ... */
    }
}

The trait RWSelector has static functions for reading from and writing to a cell tuple (CellRW). It is implemented for two zero-sized types L and R and is mainly a way to avoid having to write different methods for the different access patterns.
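The trait itself is not shown in the question. A minimal sketch of what it might look like, assuming `read` and `write` take a reference to the tuple and return a reference to one half (these signatures are inferred from how `evolve_step` calls them, not taken from the original code):

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum CellStatus {
    DEAD,
    ALIVE,
}

pub struct CellRW(CellStatus, CellStatus);

/// Assumed shape of the trait: picks the half to read and the half to write.
pub trait RWSelector {
    fn read(cell: &CellRW) -> &CellStatus;
    fn write(cell: &mut CellRW) -> &mut CellStatus;
}

/// Zero-sized marker: read the left half, write the right half.
pub struct L;
/// Zero-sized marker: read the right half, write the left half.
pub struct R;

impl RWSelector for L {
    fn read(cell: &CellRW) -> &CellStatus {
        &cell.0
    }
    fn write(cell: &mut CellRW) -> &mut CellStatus {
        &mut cell.1
    }
}

impl RWSelector for R {
    fn read(cell: &CellRW) -> &CellStatus {
        &cell.1
    }
    fn write(cell: &mut CellRW) -> &mut CellStatus {
        &mut cell.0
    }
}

fn main() {
    let mut cell = CellRW(CellStatus::ALIVE, CellStatus::DEAD);
    // Copy the value out first so the shared borrow ends before the mutable one starts.
    let current = *L::read(&cell);
    *L::write(&mut cell) = current;
    assert_eq!(cell.1, CellStatus::ALIVE);
}
```

With signatures like these, holding the results of `T::read(cell)` and `T::write(cell)` at the same time would also be rejected, because both borrow the whole `CellRW`.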

The iter_mut() method returns a BoardIter struct which is a wrapper around a mutable slice iterator for the cells vector and thus has &mut CellRW as Item type. It is also aware of the current BoardPos (x and y coordinates, offset).

I thought I'd iterate over all cell tuples, keep track of the coordinates, count the number of alive neighbours for each (read) cell (I need to know coordinates/offsets for this), compute the cell status for the next generation and write it to the other half of the tuple.

Of course, in the end, the compiler showed me the fatal flaw in my design, as I borrow self.board mutably in the iter_mut() method and then try to borrow it again immutably to get all the neighbours of the read cell.

I have not been able to come up with a good solution for this problem so far. I did manage to get it working by making all references immutable and then using an UnsafeCell to turn the immutable reference to the write cell into a mutable one. I then write to the nominally immutable reference to the writing part of the tuple through the UnsafeCell. However, that doesn't strike me as a sound design and I suspect I might run into issues with this when attempting to parallelize things.

Is there a way to implement the data layout I proposed in safe/idiomatic Rust, or is this a case where you actually have to use tricks to circumvent Rust's aliasing/borrow restrictions?

Also, as a broader question, is there a recognizable pattern for problems which require you to circumvent Rust's borrow restrictions?

Recommended answer

When is it necessary to circumvent Rust's borrow checker?

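It is needed when:

  • the borrow checker is not advanced enough to see that your usage is safe
  • you do not wish to (or cannot) write the code in a different pattern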

As a concrete case, the compiler cannot tell that this is safe:

let mut array = [1, 2];
let a = &mut array[0];
let b = &mut array[1];

The compiler doesn't know what the implementation of IndexMut for a slice does at this point of compilation (this is a deliberate design choice). For all it knows, arrays always return the exact same reference, regardless of the index argument. We can tell that this code is safe, but the compiler disallows it.

You can rewrite this in a way that is obviously safe to the compiler:

let mut array = [1, 2];
let (a, b) = array.split_at_mut(1);
let a = &mut a[0];
let b = &mut b[0];

How is this done? split_at_mut performs a runtime check to ensure that it actually is safe:

fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
    let len = self.len();
    let ptr = self.as_mut_ptr();

    unsafe {
        assert!(mid <= len);

        (from_raw_parts_mut(ptr, mid),
         from_raw_parts_mut(ptr.offset(mid as isize), len - mid))
    }
}

For an example where the borrow checker is not yet as advanced as it can be, see What are non-lexical lifetimes?.

I borrow self.board mutably in the iter_mut() method and then try to borrow it again immutably to get all the neighbours of the read cell.

If you know that the references don't overlap, then you can choose to use unsafe code to express it. However, this means you are also choosing to take on the responsibility of upholding all of Rust's invariants and avoiding undefined behavior.

The good news is that this heavy burden is what every C and C++ programmer has to (or at least should) have on their shoulders for every single line of code they write. At least in Rust, you can let the compiler deal with 99% of the cases.

In many cases, there are tools like Cell and RefCell to allow for interior mutation. In other cases, you can rewrite your algorithm to take advantage of a value being a Copy type. In other cases you can use an index into a slice for a shorter period. In other cases you can have a multi-phase algorithm.
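As a rough sketch of how two of these options could apply to the board in the question: iterate over indices instead of holding an `iter_mut()` borrow, copy the statuses out (relying on `CellStatus` being `Copy`), and index mutably only for the single write. The constructor `new`, the accessor `next`, the method `alive_neighbours` and the fixed "read `.0`, write `.1`" direction are assumptions made for illustration, not part of the original code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CellStatus {
    DEAD,
    ALIVE,
}

/// `.0` is read and `.1` is written in this sketch (the real code alternates).
#[derive(Clone, Copy)]
pub struct CellRW(CellStatus, CellStatus);

pub struct TupleBoard {
    width: usize,
    height: usize,
    cells: Vec<CellRW>,
}

impl TupleBoard {
    /// Hypothetical constructor: all cells dead except the listed (x, y) points.
    pub fn new(width: usize, height: usize, alive: &[(usize, usize)]) -> Self {
        let mut cells = vec![CellRW(CellStatus::DEAD, CellStatus::DEAD); width * height];
        for &(x, y) in alive {
            cells[y * width + x].0 = CellStatus::ALIVE;
        }
        TupleBoard { width, height, cells }
    }

    /// Status written for (x, y) by the last `evolve_step`.
    pub fn next(&self, x: usize, y: usize) -> CellStatus {
        self.cells[y * self.width + x].1
    }

    /// Counts live neighbours in the read half; cells off the board count as dead.
    fn alive_neighbours(&self, x: usize, y: usize) -> usize {
        let mut count = 0;
        for dy in [-1isize, 0, 1] {
            for dx in [-1isize, 0, 1] {
                let (nx, ny) = (x as isize + dx, y as isize + dy);
                let outside =
                    nx < 0 || ny < 0 || nx >= self.width as isize || ny >= self.height as isize;
                if (dx == 0 && dy == 0) || outside {
                    continue;
                }
                if self.cells[ny as usize * self.width + nx as usize].0 == CellStatus::ALIVE {
                    count += 1;
                }
            }
        }
        count
    }

    pub fn evolve_step(&mut self) {
        for y in 0..self.height {
            for x in 0..self.width {
                let idx = y * self.width + x;
                // Shared borrows end as soon as the Copy values are extracted.
                let current = self.cells[idx].0;
                let alive = self.alive_neighbours(x, y);
                let next = match (current, alive) {
                    (CellStatus::ALIVE, 2) | (_, 3) => CellStatus::ALIVE,
                    _ => CellStatus::DEAD,
                };
                // The mutable borrow lasts only for this one assignment.
                self.cells[idx].1 = next;
            }
        }
    }
}

fn main() {
    // A horizontal "blinker" on a 3x3 board turns vertical after one step.
    let mut board = TupleBoard::new(3, 3, &[(0, 1), (1, 1), (2, 1)]);
    board.evolve_step();
    assert_eq!(board.next(1, 0), CellStatus::ALIVE);
    assert_eq!(board.next(0, 1), CellStatus::DEAD);
}
```

This keeps the two-halves-per-cell layout and needs no unsafe code; the price is a bounds check on each index, which may or may not matter for your benchmarks.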

If you do need to resort to unsafe code, then try your best to hide it in a small area and expose safe interfaces.
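For illustration, here is one way such a small safe wrapper could look. `pair_mut` is a hypothetical helper written for this answer, not a standard library function; it follows the same idea as `split_at_mut`: check the precondition at run time, then use raw pointers inside a tiny unsafe block.

```rust
/// Returns mutable references to two distinct elements of `slice`.
///
/// Panics if `i == j` or if either index is out of bounds.
pub fn pair_mut<T>(slice: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i != j, "indices must differ");
    assert!(i < slice.len() && j < slice.len(), "index out of bounds");

    let ptr = slice.as_mut_ptr();
    // SAFETY: both indices are in bounds and distinct, so the two references
    // point at different elements and never alias. Both inherit the lifetime
    // of the unique borrow of `slice`, so nothing else can touch it meanwhile.
    unsafe { (&mut *ptr.add(i), &mut *ptr.add(j)) }
}

fn main() {
    let mut array = [1, 2, 3];
    let (a, b) = pair_mut(&mut array, 0, 2);
    *a += 10;
    *b += 20;
    std::mem::swap(a, b);
    assert_eq!(array, [23, 2, 11]);
}
```

Callers never write `unsafe` themselves, and the only invariant to audit (distinct, in-bounds indices) sits next to the code that depends on it.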

Above all, many common problems have been asked about (many times) before:
