计算稀疏矩阵的时间复杂度(2) [英] Computing time complexity of the sparse matrix (2)

查看:1201
本文介绍了计算稀疏矩阵的时间复杂度(2)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有(nxd)的数据集(D),其中n =行数,d =维数,我通过比较数据集(D)的每一行来创建相似度矩阵(S)(nxn)然后将其转换为稀疏矩阵(tx3),其中t是对称相似性矩阵(S)的非零元素数

I have a data set (D) of (nxd) where n=number of rows and d= number of dimensions, I create a similarity matrix (S)(nxn) by comparing each row of the data set (D) and then convert it into a sparse matrix (tx3) where t is the number of non-zero elements of the symmetric similarity matrix (S)

创建相似度矩阵的时间复杂度为o(n ^ 2d),其中d是一些常数运算. 转换稀疏矩阵的时间复杂度为theta(n ^ 2)

The time complexity of creating the similarity matrix is o(n^2d) where d is some constant operation. The time complexity of converting a sparse matrix is theta(n^2)

我的问题是: 在创建相似性矩阵时,如果执行如果相似性值为零",则继续(继续),否则将其放入稀疏矩阵"中进行检查.假设可以这样说,从数据集(D)计算稀疏矩阵的成本为O(n ^ 2 d).

My question is: While creating the similarity matrix if I perform a check that "if the similarity value is "zero" then proceed (continue) else put it into the sparse matrix". Assuming this can I say that the cost of computing the sparse matrix from the dataset (D) is O(n^2 d).

例如:

创建相似度矩阵:

for i in range(0,n):
    for j in range(0,n):
        find similarity_value of D[i] and D[j]
        insert into similarity_matrix: S[i,j]= similarity_value

The above runs in O(n^2 d)
    n^2 for the loops
    d   for finding the similarity between D[i] and D[j]

稀疏矩阵创建形式相似度矩阵

Sparse Matrix creation form Simiarity matrix

for i in range(0,n):
    for j in range(0,n):
        if S[i,j]==0:
            continue
        else
            insert into sparse_matrix [i, j, S[i,j]]

The above runs in O(n^2)
    n^2 for the loops

同时执行这两项操作将需要O(n ^ 2 d)+ O(n ^ 2).

Performing both the operation would require O(n^2 d) +O(n^2) if done one after another.

由于我们只需要sparse_matrix,因此我们直接创建稀疏矩阵,而无需创建相似性矩阵.

Since we require only the sparse_matrix, we create the sparse matrix directly without creating the similarity matrix.

直接创建稀疏矩阵而无需创建相似性矩阵:

Creating Sparse matrix directly without creating the similarity matrix:

for i in range(0,n):
    for j in range(0,n):
        find similarity_val of D[i] and D[j]
        if similarity_val==0:
            continue
        else
            insert into sparse_matrix [i,j,similarity_val]

我的问题是:

Wouldn't the above run in only O(n^2 d), since I am directly inserting into sparse matrix
    n^2  for the two loops
    d    for finding the similarity_val of D[i] and D[j]

请让我知道我是否缺少某些东西或我对某些东西的理解是错误的.

Please let me know if I am missing something or my understanding of something is wrong.

推荐答案

对于每对(i,j)对(总共n ^ 2),您都会到达循环的内部,在其中找到相似性,然后有条件地将元素添加到稀疏矩阵.查找相似性需要执行"d"个操作(因为您需要遍历每个维度),有条件地添加元素需要进行一定数量的操作(在值为0的情况下为1个比较操作,在值为0的情况下为1个比较操作加一个值不为零的情况下进行插入操作).由于每次到达此双循环的内部都需要执行"d"操作,并且要执行恒定数量的操作,因此总共需要执行O(n ^ 2 d)个操作.

For every (i, j) pair (there are n^2 in total), you reach the inner part of the loop where you find the similarity and then conditionally add elements to your sparse matrix. Finding the similarity takes "d" operations (because you need to loop through each of your dimensions) and conditionally adding an element takes a constant number of operations (either 1 comparison operation in the case where the value is 0 and 1 comparison operation plus one insertion operation in the case where the value is non-zero). Since you need to do "d" plus a constant number of operations each time you reach the inside of this double loop, in total you perform O(n^2 d) operations.

请注意,如果将内部循环限制为不小于i的j值(也可以将for j in range(0, n)替换为for j in range(i, n)),则此渐近运算计数将不会更改.这是因为您将到达循环内部n *(n + 1)/2次并执行"d"加上一定数量的运算,这仍然是O(n ^ 2 d)次总计算.

Note that this asymptotic operation count will not change if you limit the inner loop to values of j that are no less than i (aka replace for j in range(0, n) with for j in range(i, n)). This is because you will reach the inside of the loop n*(n+1)/2 times and perform "d" plus a constant number of operations, which is still O(n^2 d) total computation.

这篇关于计算稀疏矩阵的时间复杂度(2)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆