Replicate Spark Row N-times
Question
I want to duplicate a Row in a DataFrame, how can I do that?
For example, I have a DataFrame consisting of 1 Row, and I want to make a DataFrame with 100 identical Rows. I came up with the following solution:
var data: DataFrame = singleRowDF
for (i <- 1 to 100 - 1) {
  data = data.unionAll(singleRowDF)
}
But this introduces many transformations, and my subsequent actions seem to become very slow. Is there another way to do it?
Answer
You can add a column containing a literal Array of size 100, use explode
to turn each array element into its own row, and then drop this "dummy" column:
import org.apache.spark.sql.functions._

val result = singleRowDF
  .withColumn("dummy", explode(array((1 to 100).map(lit): _*))) // 100 elements -> 100 rows
  .selectExpr(singleRowDF.columns: _*)                          // drop the "dummy" column
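On Spark 2.4 and later, the same pattern can be written without building the literal sequence by hand, using array_repeat. This is a sketch under the same assumptions as above (a DataFrame named singleRowDF with a single row):

```scala
import org.apache.spark.sql.functions.{array_repeat, explode, lit}

// Repeat a throwaway literal 100 times, explode the resulting array so
// each element becomes a row, then re-select only the original columns
// to drop the helper column.
val replicated = singleRowDF
  .withColumn("dummy", explode(array_repeat(lit(1), 100)))
  .selectExpr(singleRowDF.columns: _*)
```

The size of the repeated array drives the row count, so replicated should contain exactly 100 copies of the original row.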