Error while exploding a struct column in Spark

Problem description

I have a dataframe whose schema looks like this:

 |-- event: struct (nullable = true)
 |    |-- event_category: string (nullable = true)
 |    |-- event_name: string (nullable = true)
 |    |-- properties: struct (nullable = true)
 |    |    |-- ErrorCode: string (nullable = true)
 |    |    |-- ErrorDescription: string (nullable = true)

I am trying to explode the struct column properties using the following code:

df_json.withColumn("event_properties", explode($"event.properties"))

But it is throwing the following exception:

cannot resolve 'explode(`event`.`properties`)' due to data type mismatch: 
input to function explode should be array or map type, 
not StructType(StructField(IDFA,StringType,true),

How can I explode the properties column?

Solution

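explode works only on array or map columns, so you need to convert the properties struct to an array first and then apply explode, as shown below. Packing the fields into an array works here because every field of properties is a string; array needs its elements to share a common type.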

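import org.apache.spark.sql.functions._
// The $"..." column syntax also needs `import spark.implicits._`, where spark is the active SparkSession
df_json.withColumn("event_properties", explode(array($"event.properties.*"))).show(false)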

This produces one row per field of properties, with that field's value in the event_properties column.
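
To try the approach end to end, here is a minimal sketch that can be pasted into spark-shell or run as a script. The sample row, the local SparkSession settings, and the names flat, exploded, withNames and flattened are assumptions made for illustration; only df_json and the schema come from the question. It also shows two related options: exploding a map so the field names are kept, and flattening the struct into ordinary columns if that is what is actually wanted.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("explode-struct-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data, shaped to match the schema in the question.
val flat = Seq(("error", "login_failed", "E42", "Bad password"))
  .toDF("event_category", "event_name", "ErrorCode", "ErrorDescription")

val df_json = flat.select(
  struct(
    $"event_category",
    $"event_name",
    struct($"ErrorCode", $"ErrorDescription").as("properties")
  ).as("event")
)

// 1) The approach from the answer: pack the struct fields into an array, then explode.
//    Each input row becomes one output row per field; the field names are lost.
val exploded = df_json.withColumn(
  "event_properties",
  explode(array($"event.properties.*"))
)

// 2) Variant that keeps the field names: build a map and explode it.
//    Exploding a map yields two columns, named key and value by default,
//    so select is used here instead of withColumn.
val withNames = df_json.select(
  $"event",
  explode(map(
    lit("ErrorCode"), $"event.properties.ErrorCode",
    lit("ErrorDescription"), $"event.properties.ErrorDescription"
  ))
)

// 3) If the goal is one column per field rather than one row per field,
//    no explode is needed: select the struct's fields directly.
val flattened = df_json.select($"event.properties.*")

exploded.show(false)
withNames.show(false)
flattened.show(false)
```

Option 1 matches the answer as given. Option 2 may be preferable when the downstream code needs to know which property each value came from.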
