Row manipulation for Dataframe in spark
Question
I have a dataframe in Spark like this:
column_A | column_B
-------- | --------
1        | 1,12,21
2        | 6,9
Both column_A and column_B are of String type.
How can I convert it to a new dataframe like this:
column_new_A | column_new_B
------------ | ------------
1            | 1
1            | 12
1            | 21
2            | 6
2            | 9
Both column_new_A and column_new_B should be of String type.
Answer
You need to split column_B on the comma and apply the explode function:
import org.apache.spark.sql.functions.{explode, split}
import spark.implicits._  // enables toDF and the $"..." column syntax

val df = Seq(
  ("1", "1,12,21"),
  ("2", "6,9")
).toDF("column_A", "column_B")
You can use either withColumn or select to create the new column. Note that withColumn keeps the original column name, while select lets you rename both columns at the same time.
df.withColumn("column_B", explode(split($"column_B", ","))).show(false)

df.select(
  $"column_A".as("column_new_A"),
  explode(split($"column_B", ",")).as("column_new_B")
).show(false)
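If you prefer SQL-style expressions, the same transform can be written with selectExpr. This is an equivalent sketch, assuming the same df as above:

```scala
// Same split + explode, expressed as SQL expressions.
df.selectExpr(
  "column_A as column_new_A",
  "explode(split(column_B, ',')) as column_new_B"
).show(false)
```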
Output:
+------------+------------+
|column_new_A|column_new_B|
+------------+------------+
|1 |1 |
|1 |12 |
|1 |21 |
|2 |6 |
|2 |9 |
+------------+------------+
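To make it clear what split plus explode actually does per row, here is the same expansion on plain Scala collections (not Spark code, just a conceptual sketch of the row-multiplication):

```scala
// Each (A, "v1,v2,...") row fans out into one (A, v) pair per value,
// mirroring split(column_B, ",") followed by explode.
val rows = Seq(("1", "1,12,21"), ("2", "6,9"))

val exploded = rows.flatMap { case (a, b) =>
  b.split(",").map(v => (a, v))
}

exploded.foreach(println)
// (1,1) (1,12) (1,21) (2,6) (2,9)
```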