Get struct-typed elements of a Row by name in Spark Scala


Question

In a DataFrame in Apache Spark (I'm using the Scala interface), if I'm iterating over its Row objects, is there any way to extract struct values by name?

I am using the code below to extract values by name, but I don't know how to read a struct value.

If the values had been of type String, we could have done this:

val resultDF = joinedDF.rdd.map { row =>
  val id = row.getAs[Long]("id")
  val values = row.getAs[String]("slotSize")
  val feilds = row.getAs[String](values)
  (id, values, feilds)
}.toDF("id", "values", "feilds")

But in my case, values has the following schema:

v1: struct (nullable = true)
     |    |-- level1: string (nullable = true)
     |    |-- level2: string (nullable = true)
     |    |-- level3: string (nullable = true)
     |    |-- level4: string (nullable = true)
     |    |-- level5: string (nullable = true)

Given that the value has the above structure, what should I replace this line with to make the code work?

  row.getAs[String](values)

Recommended answer

You can access the struct elements by first extracting another Row from the top-level Row (Spark models a struct as a nested Row), like this. Here "struct" stands for the name of the struct column, which is v1 in the schema above.

Scala implementation

val level1 = row.getAs[Row]("struct").getAs[String]("level1")
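To show the one-liner in context, here is a minimal sketch that applies it inside an rdd.map like the one in the question. The case classes Levels and Record, the helper level1Of, the local SparkSession, and the sample data are all assumptions made up for illustration; only the column names id, v1, and level1 come from the question. Because the schema marks v1 as nullable, the sketch wraps the nested Row in an Option before reading from it.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object StructByName {
  // Hypothetical case classes that mirror the schema shown in the question.
  case class Levels(level1: String, level2: String, level3: String,
                    level4: String, level5: String)
  case class Record(id: Long, v1: Option[Levels])

  // Read the struct column as a nested Row, then read a field from it by name.
  // Option(...) turns a null struct into None instead of throwing.
  def level1Of(row: Row): Option[String] =
    Option(row.getAs[Row]("v1")).map(_.getAs[String]("level1"))

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, just for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("struct-by-name")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for joinedDF.
    val joinedDF = Seq(
      Record(1L, Some(Levels("a", "b", "c", "d", "e"))),
      Record(2L, None)
    ).toDF()

    val resultDF = joinedDF.rdd.map { row =>
      (row.getAs[Long]("id"), level1Of(row))
    }.toDF("id", "level1")

    resultDF.show()
    spark.stop()
  }
}
```

Looking fields up by name only works on rows that carry a schema, which is the case for rows obtained from a DataFrame's rdd.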

Java implementation

 String level1 = f.<Row>getAs("struct").getAs("level1").toString();
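If iterating over Row objects is not strictly required, the nested fields can also be addressed by a dotted column path, which keeps the work in the DataFrame API. This is a different technique from the one in the answer, sketched here in Scala under the same assumptions as the schema in the question; the object name StructColumns, the helper flatten, the case classes, and the sample data are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object StructColumns {
  // Hypothetical case classes that mirror the schema shown in the question.
  case class Levels(level1: String, level2: String, level3: String,
                    level4: String, level5: String)
  case class Record(id: Long, v1: Levels)

  // Select nested struct fields by their dotted path and give them flat names.
  def flatten(df: DataFrame): DataFrame =
    df.select(
      col("id"),
      col("v1.level1").as("level1"),
      col("v1.level2").as("level2")
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, just for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("struct-columns")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for joinedDF.
    val joinedDF = Seq(Record(1L, Levels("a", "b", "c", "d", "e"))).toDF()

    flatten(joinedDF).show()
    spark.stop()
  }
}
```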
