Performance comparison of a Vec and a boxed slice

Views: 84 Published: 2021/6/15 19:02:48 Tags: performance rust


Question

I would like a function to

allocate a basic variable-length "array" (in the generic sense of the word, not necessarily the Rust type) of floats on the heap
initialize it with values
implement Drop, so I don't have to worry about freeing memory
implement something for indexing or iterating

The obvious choice is Vec, but how does it compare to a boxed slice on the heap? Vec is more powerful, but I need the array for numerical math and, in my case, don't need stuff like push/pop. The idea is to have something with fewer features, but faster.

Below I have two versions of a "linspace" function (a la Matlab and numpy),

(1) "linspace_vec" (see listing (1) below) uses a Vec
(2) "linspace_boxed_slice" (see listing (2) below) uses a boxed slice

Both are used like

let y = linspace_*(start, stop, len);

where y is a linearly spaced "array" (i.e. a Vec in (1) and a boxed slice in (2)) of length len.

For small "arrays" of length 1000, (1) is FASTER. For large arrays of length 4*10^6, (1) is SLOWER. Why is that? Am I doing something wrong in (2)?

When the argument len = 1000, benchmarking by just calling the function results in

(1) ... bench: 879 ns/iter (+/- 12)
(2) ... bench: 1,295 ns/iter (+/- 38)

When the argument len = 4000000, benchmarking results in

(1) ... bench: 5,802,836 ns/iter (+/- 90,209)
(2) ... bench: 4,767,234 ns/iter (+/- 121,596)

Listing (1):
pub fn linspace_vec<'a, T: 'a>(start: T, stop: T, len: usize) -> Vec<T>
where
    T: Float,
{
    // get 0, 1 and the increment dx as T
    let (one, zero, dx) = get_values_as_type_t::<T>(start, stop, len);
    let mut v = vec![zero; len];
    let mut c = zero;
    let ptr: *mut T = v.as_mut_ptr();
    unsafe {
        for ii in 0..len {
            let x = ptr.offset((ii as isize));
            *x = start + c * dx;
            c = c + one;
        }
    }

    return v;
}

Listing (2):
pub fn linspace_boxed_slice<'a, T: 'a>(start: T, stop: T, len: usize) -> Box<&'a mut [T]>
where
    T: Float,
{
    let (one, zero, dx) = get_values_as_type_t::<T>(start, stop, len);
    let size = len * mem::size_of::<T>();
    unsafe {
        let ptr = heap::allocate(size, align_of::<T>()) as *mut T;
        let mut c = zero;
        for ii in 0..len {
            let x = ptr.offset((ii as isize));
            *x = start + c * dx;
            c = c + one;
        }
        // IS THIS WHAT MAKES IT SLOW?:
        let sl = slice::from_raw_parts_mut(ptr, len);
        return Box::new(sl);
    }
}


Recommended answer

In your second version, you use the type Box<&'a mut [T]>, which means there are two levels of indirection to reach a T, because both Box and & are pointers.

What you want instead is a Box<[T]>. I think the only sane way to construct such a value is from a Vec<T>, using the into_boxed_slice method. Note that the only benefit over a Vec is that a Box<[T]> does not carry the capacity field, so the handle stores just a pointer and a length. Unless you need to have a lot of these arrays in memory at the same time, the overhead is likely to be insignificant.

pub fn linspace_vec<'a, T: 'a>(start: T, stop: T, len: usize) -> Box<[T]>
where
    T: Float,
{
    // get 0, 1 and the increment dx as T
    let (one, zero, dx) = get_values_as_type_t::<T>(start, stop, len);
    let mut v = vec![zero; len].into_boxed_slice();
    let mut c = zero;
    let ptr: *mut T = v.as_mut_ptr();
    unsafe {
        for ii in 0..len {
            let x = ptr.offset((ii as isize));
            *x = start + c * dx;
            c = c + one;
        }
    }

    v
}
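
As a rough illustration of the points above, the sketch below compares the handle sizes of Vec<f64>, Box<[f64]> and Box<&mut [f64]>, and builds a boxed slice without any unsafe code. It is only a sketch under stated assumptions: the function name linspace_boxed is hypothetical, it is fixed to f64 instead of a generic Float, and the spacing dx = (stop - start) / (len - 1) is assumed, since the helper get_values_as_type_t is not shown in the question. The size checks reflect how current Rust lays these types out and are not a documented guarantee for Vec.

```rust
use std::mem::size_of;

/// Hypothetical f64-only variant of the linspace functions above:
/// `len` evenly spaced values from `start` to `stop` (endpoint included).
fn linspace_boxed(start: f64, stop: f64, len: usize) -> Box<[f64]> {
    // Assumed spacing; the original helper that computes dx is not shown.
    let dx = if len > 1 {
        (stop - start) / (len - 1) as f64
    } else {
        0.0
    };
    // The range has a known length, so the Vec is allocated once;
    // into_boxed_slice then discards the capacity field.
    (0..len)
        .map(|i| start + i as f64 * dx)
        .collect::<Vec<f64>>()
        .into_boxed_slice()
}

fn main() {
    let word = size_of::<usize>();

    // Vec<T>: pointer + capacity + length (observed layout, not guaranteed).
    assert_eq!(size_of::<Vec<f64>>(), 3 * word);
    // Box<[T]>: a fat pointer, i.e. pointer + length.
    assert_eq!(size_of::<Box<[f64]>>(), 2 * word);
    // &mut [T] is itself a fat pointer ...
    assert_eq!(size_of::<&mut [f64]>(), 2 * word);
    // ... so Box<&mut [T]> is a thin pointer to a heap-allocated fat pointer:
    // two hops to reach the data.
    assert_eq!(size_of::<Box<&mut [f64]>>(), word);

    let y = linspace_boxed(0.0, 1.0, 5);
    println!("{:?}", y); // prints [0.0, 0.25, 0.5, 0.75, 1.0]
}
```

Indexing and iteration work on the result exactly as on a slice (y[2], y.iter()), and the memory is freed when y goes out of scope.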

