Simple and fast matrix-vector multiplication in C / C++

Views: 298 Published: 2016/8/21 21:08:24 Tags: c++ c matrix

Problem description

I need to call matrix_vector_mult() frequently. It multiplies a matrix by a vector, and its implementation is below.

Question: Is there a simple way to make it significantly, at least twice, faster?

Remarks: 1) The size of the matrix is about 300x50. It doesn't change during the run. 2) It must work on both Windows and Linux.

double vectors_dot_prod(const double *x, const double *y, int n)
{
    double res = 0.0;
    int i;
    for (i = 0; i < n; i++)
    {
        res += x[i] * y[i];
    }
    return res;
}

void matrix_vector_mult(const double **mat, const double *vec, double *result, int rows, int cols)
{ // in matrix form: result = mat * vec;
    int i;
    for (i = 0; i < rows; i++)
    {
        result[i] = vectors_dot_prod(mat[i], vec, cols);
    }
}
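
For readers who want to see how the row-pointer layout is meant to be called, here is a minimal sketch with a 2x3 matrix. The two functions are repeated from the question so the block compiles on its own; `small_example` and its values are made up for illustration and are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Same dot product as in the question. */
double vectors_dot_prod(const double *x, const double *y, int n)
{
    double res = 0.0;
    for (int i = 0; i < n; i++)
        res += x[i] * y[i];
    return res;
}

/* Same matrix-vector product as in the question: result = mat * vec. */
void matrix_vector_mult(const double **mat, const double *vec,
                        double *result, int rows, int cols)
{
    for (int i = 0; i < rows; i++)
        result[i] = vectors_dot_prod(mat[i], vec, cols);
}

/* Hypothetical helper: multiplies a fixed 2x3 matrix by a fixed vector
   and writes the two dot products into result. */
void small_example(double result[2])
{
    static const double row0[3] = {1.0, 2.0, 3.0};
    static const double row1[3] = {4.0, 5.0, 6.0};
    static const double vec[3]  = {1.0, 0.0, -1.0};
    /* An array of pointers to const rows decays to const double **. */
    const double *mat[2] = {row0, row1};

    assert(result != NULL);
    matrix_vector_mult(mat, vec, result, 2, 3);
}
```

Each row is a separate array reached through a pointer, so every dot product reads one contiguous run of `cols` doubles. Declaring the row pointers as `const double *` avoids the pointer-type mismatch you would get when passing a plain `double **` to this signature.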

Solution

In theory this is something a good compiler should do by itself. However, I tried it on my system (g++ 4.6.3) and got about twice the speed on a 300x50 matrix by hand-unrolling 4 multiplications (about 18us per matrix instead of 34us per matrix):

double vectors_dot_prod2(const double *x, const double *y, int n)
{
    double res = 0.0;
    int i = 0;
    for (; i <= n-4; i+=4)
    {
        res += (x[i] * y[i] +
                x[i+1] * y[i+1] +
                x[i+2] * y[i+2] +
                x[i+3] * y[i+3]);
    }
    for (; i < n; i++)
    {
        res += x[i] * y[i];
    }
    return res;
}

However, I expect the results of this level of micro-optimization to vary wildly between systems.
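
Since the gain depends on the compiler and machine, it is worth measuring both variants on your own setup. Below is a rough sketch of how that could be done with the standard `clock()` function, which is available on both Windows and Linux. The `dot_fn` typedef, `time_mat_vec`, and the `repeats` parameter are assumptions added for illustration; the two dot-product functions are copied from above.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical function-pointer type so either variant can be timed. */
typedef double (*dot_fn)(const double *, const double *, int);

/* Plain loop from the question. */
double vectors_dot_prod(const double *x, const double *y, int n)
{
    double res = 0.0;
    for (int i = 0; i < n; i++)
        res += x[i] * y[i];
    return res;
}

/* Hand-unrolled variant from the answer. */
double vectors_dot_prod2(const double *x, const double *y, int n)
{
    double res = 0.0;
    int i = 0;
    for (; i <= n - 4; i += 4)
        res += (x[i] * y[i] + x[i+1] * y[i+1] +
                x[i+2] * y[i+2] + x[i+3] * y[i+3]);
    for (; i < n; i++)   /* leftover elements when n is not a multiple of 4 */
        res += x[i] * y[i];
    return res;
}

/* Hypothetical helper: runs result = mat * vec `repeats` times with the
   given dot product and returns the average time per product in seconds,
   as reported by clock(). */
double time_mat_vec(dot_fn dot, const double **mat, const double *vec,
                    double *result, int rows, int cols, int repeats)
{
    assert(repeats > 0);
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        for (int i = 0; i < rows; i++)
            result[i] = dot(mat[i], vec, cols);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}
```

Call `time_mat_vec` once with `vectors_dot_prod` and once with `vectors_dot_prod2` on the same 300x50 data, using enough repeats (many thousands) that the coarse resolution of `clock()` does not dominate. Two caveats: the unrolled version adds the terms in a different order, so the two results can differ in the last bits for general floating-point input, and an aggressive optimizer may simplify a repeated loop like this, so treat the numbers as a rough guide.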
