同时可变访问保证不相交的大向量的任意索引 [英] Simultaneous mutable access to arbitrary indices of a large vector that are guaranteed to be disjoint

查看:19
本文介绍了同时可变访问保证不相交的大向量的任意索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个案例,多个线程必须更新存储在共享向量中的对象.但是向量很大,需要更新的元素数量比较少.

I have a case where multiple threads must update objects stored in a shared vector. However, the vector is very large, and the number of elements to update is relatively small.

在一个最小的例子中,要更新的元素集可以由一个(散列)集标识,该集包含要更新的元素的索引.因此,代码可能如下所示:

In a minimal example, the set of elements to update can be identified by a (hash-)set containing the indices of elements to update. The code could hence look as follows:

let mut big_vector_of_elements = generate_data_vector();

while has_things_to_do() {
    let indices_to_update = compute_indices();
    indices_to_update.par_iter() // Rayon parallel iteration
       .map(|index| big_vector_of_elements[index].mutate())
       .collect()?;
}

这在 Rust 中显然是不允许的:big_vector_of_elements 不能同时在多个线程中可变地借用.但是,将每个元素包装在例如 Mutex 锁中似乎没有必要:这种特定情况在没有显式同步的情况下是安全的.由于索引来自一个集合,它们保证是不同的.par_iter 中没有两次迭代触及向量的相同元素.

This is obviously disallowed in Rust: big_vector_of_elements cannot be borrowed mutably in multiple threads at the same time. However, wrapping each element in, e.g., a Mutex lock seems unnecessary: this specific case would be safe without explicit synchronization. Since the indices come from a set, they are guaranteed to be distinct. No two iterations in the par_iter touch the same element of the vector.

编写并行改变向量中元素的程序的最佳方法是什么,其中同步已经通过索引的选择来处理,但编译器不理解后者?

What would be the best way of writing a program that mutates elements in a vector in parallel, where the synchronization is already taken care of by the selection of indices, but where the compiler does not understand the latter?

近乎最优的解决方案是将 big_vector_of_elements 中的所有元素包装在一些假设的 UncontendedMutex 锁中,这将是 Mutex 的变体这在无争用的情况下快得离谱,并且在发生争用(甚至恐慌)时可能需要任意长的时间.理想情况下,对于任何 TUncontendedMutex 也应该与 T 具有相同的大小和对齐方式.

A near-optimal solution would be to wrap all elements in big_vector_of_elements in some hypothetical UncontendedMutex lock, which would be a variant of Mutex which is ridiculously fast in the uncontended case, and which may take arbitrarily long when contention occurs (or even panics). Ideally, an UncontendedMutex<T> should also be of the same size and alignment as T, for any T.

多个问题可以用使用 Rayon 的并行迭代器"、使用 chunks_mut"或使用 split_at_mut"来回答:

Multiple questions can be answered with "use Rayon's parallel iterator", "use chunks_mut", or "use split_at_mut":

这些答案在这里似乎不相关,因为这些解决方案意味着迭代整个 big_vector_of_elements,然后针对每个元素确定是否需要更改任何内容.本质上,这意味着这样的解决方案如下所示:

These answers do not seem relevant here, since those solutions imply iterating over the entire big_vector_of_elements, and then for each element figuring out whether anything needs to be changed. Essentially, this means that such a solution would look as follows:

let mut big_vector_of_elements = generate_data_vector();

while has_things_to_do() {
    let indices_to_update = compute_indices();
    for (index, mut element) in big_vector_of_elements.par_iter().enumerate() {
        if indices_to_update.contains(index) {
            element.mutate()?;
        }
    }
}

此解决方案花费的时间与 big_vector_of_elements 的大小成正比,而第一个解决方案只循环与 indices_to_update 的大小成正比的元素数量.

This solution takes time proportionate to the size of big_vector_of_elements, whereas the first solution loops only over a number of elements proportionate to the size of indices_to_update.

推荐答案

当编译器无法强制对切片元素的可变引用不是独占时,Cell 很不错.

When the compiler can't enforce that mutable references to a slice elements aren't exclusive, Cell is pretty nice.

您可以使用 Cell::from_mut,然后是 &Cell<[T]> 使用 Cell::as_slice_of_cells.所有这一切都是零成本的:它只是用来指导类型系统.

You can transform a &mut [T] into a &Cell<[T]> using Cell::from_mut, and then a &Cell<[T]> into a &[Cell<T>] using Cell::as_slice_of_cells. All of this is zero-cost: It's just there to guide the type-system.

A &[Cell<T>] 就像一个 &[mut T],如果可以这样写:可变元素.你可以用 Cells 做的事情仅限于读取或替换——你不能获得对被包装元素本身的引用,无论是否可变.Rust 也知道 Cell 不是线程安全的(它没有实现 同步).这保证了一切都是安全的,没有动态成本.

A &[Cell<T>] is like a &[mut T], if that were possible to write: A shared reference to a slice of mutable elements. What you can do with Cells is limited to read or replace — you can't get a reference, mutable or not, to the wrapped elements themselves. Rust also knows that Cell isn't thread-safe (it does not implement Sync). This guarantees that everything is safe, at no dynamic cost.

fn main() {
    use std::cell::Cell;

    let slice: &mut [i32] = &mut [1, 2, 3];
    let cell_slice: &Cell<[i32]> = Cell::from_mut(slice);
    let slice_cell: &[Cell<i32>] = cell_slice.as_slice_of_cells();
    
    let two = &slice_cell[1];
    let another_two = &slice_cell[1];

    println!("This is 2: {:?}", two);
    println!("This is also 2: {:?}", another_two);
    
    two.set(42);
    println!("This is now 42!: {:?}", another_two);
}

这篇关于同时可变访问保证不相交的大向量的任意索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆