如何在转星火的RDD [英] How to transpose an RDD in Spark

查看：159 发布时间：2016/5/22 15:12:21 scala apache-spark rdd

本文介绍了如何在转星火的RDD的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有一个这样的RDD：

I have an RDD like this:

1 2 3
4 5 6
7 8 9

有一个矩阵。现在我想转的RDD是这样的：

It is a matrix. Now I want to transpose the RDD like this:

1 4 7
2 5 8
3 6 9

我怎样才能做到这一点？

How can I do this?

推荐答案

假设你有一个N＆次;×M矩阵

Say you have an N×M matrix.

如果N和M都非常小，你可以按住N'倍;存储器M的项目，它并没有多大意义，使用RDD。但换位很容易：

If both N and M are so small that you can hold N×M items in memory, it doesn't make much sense to use an RDD. But transposing it is easy:

val rdd = sc.parallelize(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9)))
val transposed = sc.parallelize(rdd.collect.toSeq.transpose)

如果N或M是如此之大，你不能在内存中保留N或者M的条目，那么你就不能有这种规模的RDD线。无论是原始或转置矩阵是不可能在此情况下，重新present

If N or M is so large that you cannot hold N or M entries in memory, then you cannot have an RDD line of this size. Either the original or the transposed matrix is impossible to represent in this case.

N和M可以是一个中等规模的：你可以在内存中保留N或者M条目，但你不能持有N'次，M项。在这种情况下，你必须炸毁矩阵，并再次把它在一起：

N and M may be of a medium size: you can hold N or M entries in memory, but you cannot hold N×M entries. In this case you have to blow up the matrix and put it together again:

val rdd = sc.parallelize(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9)))
// Split the matrix into one number per line.
val byColumnAndRow = rdd.zipWithIndex.flatMap {
  case (row, rowIndex) => row.zipWithIndex.map {
    case (number, columnIndex) => columnIndex -> (rowIndex, number)
  }
}
// Build up the transposed matrix. Group and sort by column index first.
val byColumn = byColumnAndRow.groupByKey.sortByKey().values
// Then sort by row index.
val transposed = byColumn.map {
  indexedRow => indexedRow.toSeq.sortBy(_._1).map(_._2)
}

这篇关于如何在转星火的RDD的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

如何在转星火的RDD [英] How to transpose an RDD in Spark

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

如何在转星火的RDD [英] How to transpose an RDD in Spark

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭