Matrix Transpose on RowMatrix in Spark
Question
Suppose I have a RowMatrix.
- How can I transpose it? The API documentation does not seem to have a transpose method.
- Matrix has a transpose() method, but it is not distributed. If I have a matrix that is larger than memory, how can I transpose it?
I have converted a RowMatrix to a DenseMatrix as follows:
DenseMatrix Mat = new DenseMatrix(m,n,MatArr);
This requires converting the RowMatrix to a JavaRDD and then converting the JavaRDD to an array.
Is there any other convenient way to do the conversion?
Thanks in advance.
Recommended answer
You are correct: there is no RowMatrix.transpose() method. You will need to do this operation manually.
Here is the non-distributed/local matrix version:
def transpose(m: Array[Array[Double]]): Array[Array[Double]] = {
  // For each column index c, collect element c of every row into a new row.
  (for {
    c <- m(0).indices
  } yield m.map(_(c))).toArray
}
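As a quick check of the local version, here is a minimal sketch that applies it to a 2 x 3 example. The sample values and the names m, t and builtIn are made up for illustration. It also shows that the Scala standard library's transpose on nested arrays gives the same result, so for purely local data the helper is optional.

```scala
// Local (non-distributed) transpose, same logic as the helper above.
def transpose(m: Array[Array[Double]]): Array[Array[Double]] =
  (for (c <- m(0).indices) yield m.map(_(c))).toArray

// Illustrative 2 x 3 matrix stored as an array of rows.
val m = Array(
  Array(1.0, 2.0, 3.0),
  Array(4.0, 5.0, 6.0)
)

// t has 3 rows of 2 elements: (1.0, 4.0), (2.0, 5.0), (3.0, 6.0)
val t = transpose(m)

// The standard library's transpose on nested arrays does the same thing.
val builtIn = m.transpose
```

Both versions assume every row has the same length and that the whole matrix fits in local memory.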
The distributed version would be along the following lines:
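import org.apache.spark.mllib.linalg.Vectors

val nRows = origMatRdd.numRows().toInt
// Emit (column j, (row i, value)) for every element, then regroup by column.
val transposedRows = origMatRdd.rows.zipWithIndex.flatMap { case (rvect, i) =>
  rvect.toArray.zipWithIndex.map { case (ax, j) => (j, (i.toInt, ax)) }
}.groupByKey.sortByKey().map { case (_, col) =>
  // Each original column becomes one dense row of the transpose.
  val arr = new Array[Double](nRows)
  col.foreach { case (i, ax) => arr(i) = ax }
  Vectors.dense(arr)
}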
Caveat: I have not tested the above, so expect bugs. But the basic approach is valid, and it is similar to work I did in the past for a small LinAlg library for Spark.
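As an alternative to hand-rolling the shuffle, here is a hedged sketch that goes through CoordinateMatrix, which does provide a transpose() method. It assumes Spark 1.3 or later, where IndexedRowMatrix.toCoordinateMatrix() and CoordinateMatrix.transpose() are available. The helper name transposeRowMatrix is hypothetical and not part of the Spark API.

```scala
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix, RowMatrix}

// Hypothetical helper: transposes a RowMatrix via its coordinate form.
def transposeRowMatrix(mat: RowMatrix): IndexedRowMatrix = {
  // RowMatrix carries no row indices, so attach them explicitly.
  // This relies on the current ordering of the underlying RDD.
  val indexed = new IndexedRowMatrix(
    mat.rows.zipWithIndex.map { case (v, i) => IndexedRow(i, v) }
  )
  // Convert to (i, j, value) entries, swap i and j, and regroup into rows.
  indexed.toCoordinateMatrix().transpose().toIndexedRowMatrix()
}
```

The result is kept as an IndexedRowMatrix because the rows come back from a shuffle in no particular order, and the index is what records which column of the original each row corresponds to. Calling toRowMatrix() on it drops those indices, so sort by index first if the row order matters.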