Error while exploding a struct column in Spark

Question

I have a DataFrame whose schema looks like this:

 |-- event: struct (nullable = true)
 |    |-- event_category: string (nullable = true)
 |    |-- event_name: string (nullable = true)
 |    |-- properties: struct (nullable = true)
 |    |    |-- ErrorCode: string (nullable = true)
 |    |    |-- ErrorDescription: string (nullable = true)

I am trying to explode the struct column properties using the following code:

df_json.withColumn("event_properties", explode($"event.properties"))

But it throws the following exception:

cannot resolve 'explode(`event`.`properties`)' due to data type mismatch: 
input to function explode should be array or map type, 
not StructType(StructField(IDFA,StringType,true),

How can I explode the properties column?

Recommended answer

explode only accepts array or map columns, so you need to convert the properties struct to an array first and then apply explode, as shown below:

import org.apache.spark.sql.functions._
// array(...) packs the struct's fields into an array; explode then emits one row per field value
df_json.withColumn("event_properties", explode(array($"event.properties.*"))).show(false)

This should give you the result you need.
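For anyone who wants to try the approach in isolation, here is a minimal, self-contained sketch. The object name ExplodeStructSketch, the helper names sampleFrame and explodeProperties, the local SparkSession settings, and the sample values are all assumptions made for illustration; only the withColumn/explode/array expression comes from the answer. It also assumes every field in properties has the same type (string here), since array needs elements of a common type.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, explode, struct}

object ExplodeStructSketch {

  // Hypothetical one-row frame shaped like the schema in the question.
  def sampleFrame(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("error", "upload_failed", "E42", "disk full"))
      .toDF("event_category", "event_name", "ErrorCode", "ErrorDescription")
      .select(
        struct(
          col("event_category"),
          col("event_name"),
          struct(col("ErrorCode"), col("ErrorDescription")).as("properties")
        ).as("event")
      )
  }

  // Pack the struct's fields into an array, then emit one row per element.
  def explodeProperties(df: DataFrame): DataFrame =
    df.withColumn("event_properties", explode(array(col("event.properties.*"))))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explode-struct-sketch")
      .getOrCreate()
    explodeProperties(sampleFrame(spark)).show(false)
    spark.stop()
  }
}
```

With the sample row above, the result has two rows, one per field of properties, and each row still carries the original event column. If what you actually want is one column per field rather than one row per field, selecting "event.properties.*" directly is a different technique from the one in the answer and needs no explode at all.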
