Matrix multiplication: Small difference in matrix size, large difference in timings


Question

I have a matrix multiply code that looks like this:

for(i = 0; i < dimension; i++)
    for(j = 0; j < dimension; j++)
        for(k = 0; k < dimension; k++)
            C[dimension*i+j] += A[dimension*i+k] * B[dimension*k+j];

Here, the size of the matrices is represented by dimension. Now, if the size of the matrices is 2000, it takes 147 seconds to run this piece of code, whereas if the size of the matrices is 2048, it takes 447 seconds. So while the ratio of the number of multiplications is (2048*2048*2048)/(2000*2000*2000) = 1.073, the ratio of the timings is 447/147 = 3. Can someone explain why this happens? I expected it to scale roughly linearly with the amount of work, which does not happen. I am not trying to make the fastest matrix multiply code, simply trying to understand why this happens.

Specs: AMD Opteron dual core node (2.2 GHz), 2 GB RAM, gcc v4.5.0

The program is compiled with gcc -O3 simple.c.

I have run this with Intel's icc compiler as well, and seen similar results.

As suggested in the comments/answers, I ran the code with dimension=2060 and it takes 145 seconds.

Here is the complete program:

#include <stdlib.h>
#include <stdio.h>
#include <sys/time.h>

/* change dimension size as needed */
const int dimension = 2048;
struct timeval tv; 

double timestamp()
{
        double t;
        gettimeofday(&tv, NULL);
        t = tv.tv_sec + (tv.tv_usec/1000000.0);
        return t;
}

int main(int argc, char *argv[])
{
        int i, j, k;
        double *A, *B, *C, start, end;

        A = (double*)malloc(dimension*dimension*sizeof(double));
        B = (double*)malloc(dimension*dimension*sizeof(double));
        C = (double*)malloc(dimension*dimension*sizeof(double));

        srand(292);

        for(i = 0; i < dimension; i++)
                for(j = 0; j < dimension; j++)
                {   
                        A[dimension*i+j] = (rand()/(RAND_MAX + 1.0));
                        B[dimension*i+j] = (rand()/(RAND_MAX + 1.0));
                        C[dimension*i+j] = 0.0;
                }   

        start = timestamp();
        for(i = 0; i < dimension; i++)
                for(j = 0; j < dimension; j++)
                        for(k = 0; k < dimension; k++)
                                C[dimension*i+j] += A[dimension*i+k] *
                                        B[dimension*k+j];

        end = timestamp();
        printf("\nsecs:%f\n", end-start);

        free(A);
        free(B);
        free(C);

        return 0;
}

Answer

Here's my guess: cache.

It could be that you can fit 2 rows of 2000 doubles into the cache, which is slightly less than the 32 KB L1 cache (while leaving room for other necessary things).

But when you bump it up to 2048, it uses the entire cache (and you spill some because you need room for other things).

Assuming the cache policy is LRU, spilling the cache just a tiny bit will cause the entire row to be repeatedly flushed and reloaded into the L1 cache.

The other possibility is cache associativity due to the power of two. Though I think that processor is 2-way L1 associative, so I don't think it matters in this case. (But I'll throw the idea out there anyway.)

Possible explanation 2: Conflict cache misses due to super-alignment on the L2 cache.

Your B array is being iterated down its columns, so the access is strided. Your total data size is 2k x 2k, which is about 32 MB per matrix. That's much larger than your L2 cache.

When the data is not aligned perfectly, you will have decent spatial locality on B. Although you are hopping rows and only using one element per cache line, the cache line stays in the L2 cache to be reused by the next iteration of the middle loop.

However, when the data is aligned perfectly (2048), these hops will all land on the same "cache way" and will far exceed your L2 cache associativity. Therefore, the accessed cache lines of B will not stay in cache for the next iteration. Instead, they will need to be pulled in all the way from RAM.

