Exploded Struct in Spark
Question
I have a DataFrame with the following schema:
|-- data: struct (nullable = true)
|    |-- asin: string (nullable = true)
|    |-- customerId: long (nullable = true)
|    |-- eventTime: long (nullable = true)
|    |-- marketplaceId: long (nullable = true)
|    |-- rating: long (nullable = true)
|    |-- region: string (nullable = true)
|    |-- type: string (nullable = true)
|-- uploadedDate: long (nullable = true)
I want to explode the struct so that all of its elements, such as asin, customerId, and eventTime, become columns in the DataFrame. I tried the explode function, but it works on Array columns, not on struct columns. Is it possible to convert the above DataFrame into the DataFrame below?
|-- asin: string (nullable = true)
|-- customerId: long (nullable = true)
|-- eventTime: long (nullable = true)
|-- marketplaceId: long (nullable = true)
|-- rating: long (nullable = true)
|-- region: string (nullable = true)
|-- type: string (nullable = true)
|-- uploadedDate: long (nullable = true)
Recommended answer
This is quite simple:
val newDF = df.select("uploadedDate", "data.*");
This tells Spark to select uploadedDate and then every sub-field of the data column. The sub-fields are expanded into top-level columns, in the order in which they are listed in select.
Example:
scala> case class A(a: Int, b: Double)
scala> val df = Seq((A(1, 1.0), "1"), (A(2, 2.0), "2")).toDF("data", "uploadedDate")
scala> val newDF = df.select("uploadedDate", "data.*")
scala> newDF.show()
+------------+---+---+
|uploadedDate|  a|  b|
+------------+---+---+
|           1|  1|1.0|
|           2|  2|2.0|
+------------+---+---+
scala> newDF.printSchema()
root
|-- uploadedDate: string (nullable = true)
|-- a: integer (nullable = true)
|-- b: double (nullable = true)
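The schema requested in the question lists the struct fields first and uploadedDate last, whereas the snippet above puts uploadedDate first. A minimal sketch of the same technique with the columns in the requested order follows. It assumes a local SparkSession; the case classes Data and Record, the object name FlattenStruct, and the sample values are hypothetical and exist only for illustration, and Data keeps just a subset of the original fields for brevity.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical case classes mirroring part of the schema from the question
// (only three of the struct fields are kept to keep the sketch short).
case class Data(asin: String, customerId: Long, rating: Long)
case class Record(data: Data, uploadedDate: Long)

object FlattenStruct {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is sufficient for this illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-struct")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows; each Record becomes a row with a struct column `data`.
    val df = Seq(
      Record(Data("A1", 10L, 5L), 100L),
      Record(Data("B2", 20L, 3L), 200L)
    ).toDF()

    // "data.*" expands every field of the struct into a top-level column.
    // Listing it before uploadedDate yields: asin, customerId, rating, uploadedDate.
    val flat = df.select(col("data.*"), col("uploadedDate"))

    flat.printSchema()
    flat.show()

    spark.stop()
  }
}
```

Passing plain strings, as in df.select("data.*", "uploadedDate"), should behave the same way; the col(...) form is used here only because it makes it easy to mix in other column expressions later.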