Convert Row to map in Spark Scala


Problem Description

I have a row from a data frame and I want to convert it to a Map[String, Any] that maps column names to the values in the row for that column.

Is there an easy way to do this?

I did this for the specific case of String values:

def rowToMap(row: Row): Map[String, String] = {
  row.schema.fieldNames.map(field => field -> row.getAs[String](field)).toMap
}

val myRowMap = rowToMap(myRow)

If the row contains other values, not specific ones like String, then the code gets messier because the row does not have a method .get(field).
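
A rough sketch of the messier generic version (rowToAnyMap is just an illustrative name): look up each field's position with fieldIndex and read it with the positional get, which returns Any:

import org.apache.spark.sql.Row

def rowToAnyMap(row: Row): Map[String, Any] = {
  // For each column name, find its index and read the value positionally;
  // Row.get(i) returns Any, so no per-type getAs call is needed.
  row.schema.fieldNames.map { field =>
    field -> row.get(row.fieldIndex(field))
  }.toMap
}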

Any ideas?

Recommended Answer

You can use getValuesMap:

// toDF on a local Seq needs `import spark.implicits._` from an active SparkSession.
val df = Seq((1, 2.0, "a")).toDF("A", "B", "C")
val row = df.first

To get a Map[String, Any]:

row.getValuesMap[Any](row.schema.fieldNames)
// res19: Map[String,Any] = Map(A -> 1, B -> 2.0, C -> a)
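
Since the values come back typed as Any, a small usage sketch (not part of the original answer) is to pattern match on the runtime type when consuming the map:

val valuesMap = row.getValuesMap[Any](row.schema.fieldNames)
valuesMap.foreach {
  case (name, v: Int)    => println(s"$name: Int = $v")
  case (name, v: Double) => println(s"$name: Double = $v")
  case (name, v: String) => println(s"$name: String = $v")
  case (name, v)         => println(s"$name: other = $v")  // e.g. null, Timestamp, nested Row
}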

Or, for this simple case, you can get a Map[String, AnyVal], since the values are not complex objects:

row.getValuesMap[AnyVal](row.schema.fieldNames)
// res20: Map[String,AnyVal] = Map(A -> 1, B -> 2.0, C -> a)

Note: the value type returned by getValuesMap can be labelled as any type, so you cannot rely on it to figure out which data types you actually have; you need to keep in mind what you have from the beginning instead.
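
For example (an illustrative sketch of that caveat, not from the original answer): the type parameter only drives an unchecked cast, so a wrong label still compiles and still builds the map; the mismatch only surfaces when a value is used as the labelled type:

val mislabelled = row.getValuesMap[Int](row.schema.fieldNames)
// Compiles and runs: due to type erasure the map still holds 1, 2.0 and "a" unchanged.
// val c: Int = mislabelled("C")   // only here would a ClassCastException be thrown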
