Exploding StructType as MapType in Spark
Problem description
Converting a StructType to a MapType in Spark.
Schema:
 |-- event: struct (nullable = true)
 |    |-- event_category: string (nullable = true)
 |    |-- event_name: string (nullable = true)
 |    |-- properties: struct (nullable = true)
 |    |    |-- prop1: string (nullable = true)
 |    |    |-- prop2: string (nullable = true)
Sample data:
{ "event": {
"event_category": "abc",
"event_name": "click",
"properties" : {
"prop1": "prop1Value",
"prop2": "prop2Value",
....
}
}
}
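To experiment with the sample above, it can be loaded into a dataframe first. The following is a minimal sketch, not part of the original question: it assumes Spark 2.2 or later, a local SparkSession, and that only the two properties shown (prop1, prop2) are present, since the remaining ones are elided. The name df_json matches the dataframe used in the answer further down.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session, for experimentation only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("struct-to-map")
  .getOrCreate()
import spark.implicits._

// One JSON document per string; only the two properties shown in the sample.
val sample = Seq(
  """{"event":{"event_category":"abc","event_name":"click","properties":{"prop1":"prop1Value","prop2":"prop2Value"}}}"""
)

// Let Spark infer the nested struct schema from the JSON text.
val df_json = spark.read.json(sample.toDS())
df_json.printSchema()
```

The printed schema should have the same nesting as the one listed above, with event and properties inferred as structs.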
Desired output:
event_category | event_name | properties_key | properties_value
abc            | click      | prop1          | prop1Value
abc            | click      | prop2          | prop2Value
Recommended answer
You will have to find some mechanism to create a map from the properties struct. I have used a udf function to zip the keys and values and return an array of (key, value) pairs.
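import org.apache.spark.sql.functions._
import spark.implicits._ // for the $"..." syntax; assumes a SparkSession named spark
// Zip column names with their values into an array of (key, value) pairs.
val collectUdf = udf((cols: Seq[String], values: Seq[String]) => cols.zip(values))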
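To see what the zip inside the udf produces for a single row, here is a minimal plain-Scala sketch that needs no Spark at all. The two sequences are illustrative stand-ins for the field names and field values of the properties struct in the sample data.

```scala
// Plain Scala collections: this is the per-row logic applied by the udf.
val cols = Seq("prop1", "prop2")             // field names of the struct
val values = Seq("prop1Value", "prop2Value") // field values for one row

// zip pairs elements by position and stops at the shorter sequence.
val pairs: Seq[(String, String)] = cols.zip(values)

println(pairs) // List((prop1,prop1Value), (prop2,prop2Value))
```

When the udf returns such a sequence of tuples, Spark exposes it as an array of structs with fields _1 and _2, which is why the final select refers to event_properties._1 and event_properties._2. Because the values are gathered with array(...), this approach fits the case where all properties share one type, as they do (string) in the schema above.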
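Spark does not support more than one generator (such as explode) in a single select clause, so the exploded result has to be stored in a temporary dataframe first.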
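// Names of the fields inside event.properties, e.g. Array("prop1", "prop2").
val columnsMap = df_json.select($"event.properties.*").columns
// Zip names with values, then explode into one (key, value) struct per row.
val temp = df_json.withColumn("event_properties", explode(collectUdf(lit(columnsMap), array($"event.properties.*"))))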
The last step is just to split the event_properties column into its key and value parts:
temp.select($"event.event_category", $"event.event_name", $"event_properties._1".as("properties_key"), $"event_properties._2".as("properties_value")).show(false)
You should then have the desired output:
+--------------+----------+--------------+----------------+
|event_category|event_name|properties_key|properties_value|
+--------------+----------+--------------+----------------+
|abc |click |prop1 |prop1Value |
|abc |click |prop2 |prop2Value |
+--------------+----------+--------------+----------------+
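As a possible alternative that avoids the udf, the struct can be turned into an actual MapType column with the built-in map function and then exploded, since exploding a map yields a key column and a value column. This is a sketch of a different technique from the one in the answer, under these assumptions: Spark 2.2 or later, a local SparkSession, all properties being strings, and only prop1 and prop2 being present. The names propNames, propsMap and result are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, lit, map}

// Assumption: a local session and the two-property sample from the question.
val spark = SparkSession.builder().master("local[*]").appName("struct-to-map").getOrCreate()
import spark.implicits._

val df_json = spark.read.json(Seq(
  """{"event":{"event_category":"abc","event_name":"click","properties":{"prop1":"prop1Value","prop2":"prop2Value"}}}"""
).toDS())

// Field names of the properties struct.
val propNames = df_json.select($"event.properties.*").columns

// Alternate key literal and value column: map(k1, v1, k2, v2, ...).
val propsMap = map(propNames.flatMap(n => Seq(lit(n), col(s"event.properties.$n"))): _*)

// Exploding a map column yields two columns, named here through the alias list.
val result = df_json.select(
  $"event.event_category",
  $"event.event_name",
  explode(propsMap).as(Seq("properties_key", "properties_value"))
)
result.show(false)
```

With a single explode in the select, no temporary dataframe is needed, and the intermediate propsMap column is a genuine MapType rather than an array of tuples.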