Efficient computation of Kronecker products in C


Question

I'm fairly new to C, since most of my research has not needed anything faster than Python. However, recent work has required computations on fairly large vectors and matrices, so a C+MPI solution might be in order.

Mathematically speaking, the task is very simple. I have a lot of vectors of dimensionality ~40k and wish to compute the Kronecker product of selected pairs of these vectors, and then sum these Kronecker products.

The question is how to do this efficiently. Is there anything wrong with the following code structure, which uses for loops, or is there a better way to obtain the same effect?

The function kron below takes vectors A and B, each of length vector_size, and computes their Kronecker product, which it stores in C, a vector_size*vector_size matrix.

void kron(int *A, int *B, int *C, int vector_size) {

    int i,j;

    for(i = 0; i < vector_size; i++) {
        for (j = 0; j < vector_size; j++) {
            C[i*vector_size+j] = A[i] * B[j];
        }
    }
    return;
}

This seems fine to me, and it certainly produces the right result (if I've not made some silly syntax error), but I have a sneaking suspicion that nested for loops are not optimal. If there's another way I should be going about this, please let me know. Suggestions welcome.

Thank you for your patience and for any advice you may have. Once again, I'm very inexperienced with C, and Googling around has brought me little joy for this query.

Recommended answer

For double-precision vectors (single-precision and complex are similar), you can use the BLAS routine DGER (rank-one update) or similar to do the products one at a time, since each one operates on a pair of vectors. How many vectors are you multiplying? Remember that adding up a bunch of vector outer products (which is how you can treat these Kronecker products) amounts to a matrix-matrix multiplication, which BLAS's DGEMM can handle efficiently. You might need to write your own routines if you truly need integer operations, though.
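To make the equivalence concrete, here is a minimal sketch in plain C, with no BLAS calls, of the two formulations. It assumes the K selected vectors are stacked as the rows of two row-major K x n arrays A and B; that layout and the function names (outer_accumulate, kron_sum_rank_one, kron_sum_gemm) are illustrative assumptions, not part of the question or of any library. The first routine does the same arithmetic as a DGER-style rank-one update with alpha = 1, and the last one is the product A^T * B, which is the shape of work a DGEMM call would take over.

```c
#include <stddef.h>

/* Rank-one update C += a * b^T on an n x n row-major matrix.
 * This is the arithmetic a DGER call performs with alpha = 1. */
void outer_accumulate(const double *a, const double *b, double *C, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const double ai = a[i];
        double *row = C + i * n;
        for (size_t j = 0; j < n; j++)
            row[j] += ai * b[j];
    }
}

/* Sum of K outer products, accumulated one pair at a time.
 * Assumed layout: A and B are K x n, row-major; row k holds a_k (resp. b_k). */
void kron_sum_rank_one(const double *A, const double *B, double *C,
                       size_t K, size_t n)
{
    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0;
    for (size_t k = 0; k < K; k++)
        outer_accumulate(A + k * n, B + k * n, C, n);
}

/* The same sum written as the matrix product C = A^T * B:
 * C[i][j] = sum over k of A[k][i] * B[k][j]. */
void kron_sum_gemm(const double *A, const double *B, double *C,
                   size_t K, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < K; k++)
                s += A[k * n + i] * B[k * n + j];
            C[i * n + j] = s;
        }
    }
}
```

Both routines fill C with the same n x n result, so the sum never requires storing the individual Kronecker products. The naive triple loop here is only meant to show the structure; an optimized BLAS would normally be much faster for the matrix-product form.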
