Consume non-overlapping vector chunks, and combine results

Question

I'm trying to speed up an expensive computation on a large vector by using threads. My function consumes a vector, computes a vector of new values (it doesn't aggregate, but input order has to be retained), and returns it. However, I'm struggling to figure out how to spawn threads, assign vector slices to each, and then collect and combine the results.

// tunable
const NUMTHREADS: i32 = 4;

fn f(val: i32) -> i32 {
    // expensive computation
    let res = val + 1;
    res
}

fn main() {
    // choose an odd number of elements
    let orig = (1..14).collect::<Vec<i32>>();
    let mut result: Vec<Vec<i32>> = vec!();
    let mut flat: Vec<i32> = Vec::with_capacity(orig.len());
    // split into slices
    for chunk in orig.chunks(orig.len() / NUMTHREADS as usize) {
        result.push(
            chunk.iter().map(|&digit|
                f(digit)).collect()
            );
    };
    // flatten result vector
    for subvec in result.iter() {
        for elem in subvec.iter() {
            flat.push(elem.to_owned());
        }
    }
    println!("Flattened result: {:?}", flat);
}

The threaded computation should be taking place between for chunk… and // flatten …, but I can't find many simple examples of spawning x threads, assigning chunks sequentially, and returning the newly-computed vector out of the thread and into a container so it can be flattened. Do I have to wrap orig.chunks() in an Arc, and manually grab each chunk in a loop? Do I have to pass f into each thread? Will I have to use a B-Tree to ensure that input and output order match? Can I just use simple_parallel?

Recommended answer

Well, this is an ideal application for the unstable thread::scoped():

#![feature(scoped)]
use std::thread::{self, JoinGuard};

// tunable
const NUMTHREADS: i32 = 4;

fn f(val: i32) -> i32 {
    // expensive computation
    let res = val + 1;
    res
}

fn main() {
    // choose an odd number of elements
    let orig: Vec<i32> = (1..14).collect();

    let mut guards: Vec<JoinGuard<Vec<i32>>> = vec!();

    // split into slices
    for chunk in orig.chunks(orig.len() / NUMTHREADS as usize) {
        let g = thread::scoped(move || chunk.iter().cloned().map(f).collect());
        guards.push(g);
    };

    // collect the results
    let mut result: Vec<i32> = Vec::with_capacity(orig.len());
    for g in guards {
        result.extend(g.join().into_iter());
    }

    println!("Flattened result: {:?}", result);
}

It is unstable and is unlikely to be stabilized in this form because it has an inherent flaw (you can find more here). As far as I can see, simple_parallel is just an extension of this approach: it hides the fiddling with JoinGuards and can also be used in stable Rust (probably with some unsafe code, I believe). However, as its docs suggest, it is not recommended for general use.
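For readers on a current toolchain: the standard library has since gained std::thread::scope (stable since Rust 1.63), which lets threads borrow slices without cloning. The following is a minimal sketch of the same idea using that API, not the original answer's code. The helper name parallel_map and its num_threads parameter are hypothetical, introduced only for illustration, and f is the stand-in computation from the question. The chunk size is rounded up so that no more than num_threads chunks are produced.

```rust
use std::thread;

// Stand-in for the expensive computation from the question.
fn f(val: i32) -> i32 {
    val + 1
}

// Hypothetical helper (not part of any library): maps `f` over `input`
// using at most `num_threads` threads and preserves the input order.
fn parallel_map(input: &[i32], num_threads: usize) -> Vec<i32> {
    if input.is_empty() {
        return Vec::new();
    }
    // Ceiling division so that at most `threads` chunks are produced;
    // `max(1)` guards against a thread count of zero.
    let threads = num_threads.max(1);
    let chunk_size = (input.len() + threads - 1) / threads;

    thread::scope(|s| {
        // Spawn one scoped thread per chunk; each borrows its slice directly.
        let handles: Vec<_> = input
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().copied().map(f).collect::<Vec<i32>>()))
            .collect();

        // Join in spawn order, so the output order matches the input order.
        let mut result = Vec::with_capacity(input.len());
        for handle in handles {
            result.extend(handle.join().expect("worker thread panicked"));
        }
        result
    })
}

fn main() {
    let orig: Vec<i32> = (1..14).collect();
    let result = parallel_map(&orig, 4);
    println!("Flattened result: {:?}", result);
}
```

Because the handles are joined in the order they were spawned, no B-Tree or other ordered container is needed to keep input and output aligned.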

Of course, you can use thread::spawn(), but then you will need to clone each chunk so that it can be moved into its thread:

use std::thread::{self, JoinHandle};

// tunable
const NUMTHREADS: i32 = 4;

fn f(val: i32) -> i32 {
    // expensive computation
    let res = val + 1;
    res
}

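fn main() {
    // choose an odd number of elements
    let orig: Vec<i32> = (1..14).collect();

    let mut handles: Vec<JoinHandle<Vec<i32>>> = Vec::new();

    // round the chunk size up so that at most NUMTHREADS threads are spawned
    // (13 / 4 would give chunks of 3, i.e. five chunks and five threads)
    let n = NUMTHREADS as usize;
    let chunk_size = (orig.len() + n - 1) / n;

    // split into slices
    for chunk in orig.chunks(chunk_size) {
        // each thread gets its own copy, because spawned threads cannot borrow from `orig`
        let chunk = chunk.to_owned();
        let g = thread::spawn(move || chunk.into_iter().map(f).collect());
        handles.push(g);
    }

    // collect the results in spawn order, which preserves the input order
    let mut result: Vec<i32> = Vec::with_capacity(orig.len());
    for handle in handles {
        result.extend(handle.join().unwrap());
    }

    println!("Flattened result: {:?}", result);
}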
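The question also asks whether an Arc is needed. It is not required, but it is one way to avoid cloning each chunk with thread::spawn(): the whole vector is moved into an Arc once, and each thread receives a cheap handle plus the index range it should process. This is a sketch under that assumption; parallel_map_shared and num_threads are hypothetical names introduced only for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Stand-in for the expensive computation from the question.
fn f(val: i32) -> i32 {
    val + 1
}

// Hypothetical helper: shares the input through an Arc instead of cloning
// each chunk, and hands every thread an index range to work on.
fn parallel_map_shared(input: Vec<i32>, num_threads: usize) -> Vec<i32> {
    let len = input.len();
    if len == 0 {
        return Vec::new();
    }
    // Ceiling division so that at most `threads` ranges are produced.
    let threads = num_threads.max(1);
    let chunk_size = (len + threads - 1) / threads;
    let shared = Arc::new(input);

    let handles: Vec<thread::JoinHandle<Vec<i32>>> = (0..len)
        .step_by(chunk_size)
        .map(|start| {
            // Cloning the Arc only bumps a reference count; the data is not copied.
            let data = Arc::clone(&shared);
            let end = (start + chunk_size).min(len);
            thread::spawn(move || data[start..end].iter().copied().map(f).collect::<Vec<i32>>())
        })
        .collect();

    // Join in spawn order, so the output order matches the input order.
    let mut result = Vec::with_capacity(len);
    for handle in handles {
        result.extend(handle.join().expect("worker thread panicked"));
    }
    result
}

fn main() {
    let orig: Vec<i32> = (1..14).collect();
    println!("Flattened result: {:?}", parallel_map_shared(orig, 4));
}
```

The trade-off is that the function takes ownership of the vector; for small element types such as i32, cloning each chunk as in the answer above is usually just as reasonable.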
