SparkSQL and explode on DataFrame in Java


Question

Is there an easy way to use explode on an array column of a SparkSQL DataFrame? It's relatively simple in Scala, but this function seems to be unavailable in Java (as mentioned in the javadoc).

One option is to call SQLContext.sql(...) and use the explode function inside the query, but I'm looking for a better and, above all, cleaner way. The DataFrames are loaded from Parquet files.
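For comparison, the SQL-based option mentioned above might look roughly like the sketch below. It assumes the Spark 2.x+ Java API (SparkSession and Dataset<Row> instead of SQLContext and DataFrame); the view name "items" and the columns "id" and "tags" are hypothetical and only serve the illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ExplodeSqlExample {

    // Registers the DataFrame as a temporary view and explodes the
    // (hypothetical) array column "tags" inside a SQL query.
    static Dataset<Row> explodeTagsWithSql(SparkSession spark, Dataset<Row> df) {
        df.createOrReplaceTempView("items"); // hypothetical view name
        return spark.sql("SELECT id, explode(tags) AS tag FROM items");
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("explode-sql-example")
                .master("local[1]")
                .getOrCreate();

        // Small in-memory sample standing in for data loaded from Parquet.
        Dataset<Row> df = spark.sql(
                "SELECT 1 AS id, array('a', 'b') AS tags "
                + "UNION ALL SELECT 2 AS id, array('c') AS tags");

        explodeTagsWithSql(spark, df).show();
        spark.stop();
    }
}
```

This works, but the column names live inside a query string, which is the part the question would like to avoid.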

Recommended answer

It seems possible to combine org.apache.spark.sql.functions.explode(Column col) with DataFrame.withColumn(String colName, Column col) to replace the column with its exploded version, producing one output row per array element.
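A minimal sketch of that combination follows. It assumes the Spark 2.x+ Java API, where a DataFrame is typed as Dataset<Row> and the entry point is SparkSession; on Spark 1.x the same two calls would be made on DataFrame. The helper explodeColumn, the class name, and the columns "id" and "tags" are hypothetical names chosen for the illustration.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ExplodeExample {

    // Replaces the array column `columnName` with its exploded version:
    // the column keeps its name, and each array element gets its own row.
    static Dataset<Row> explodeColumn(Dataset<Row> df, String columnName) {
        return df.withColumn(columnName, explode(col(columnName)));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("explode-example")
                .master("local[1]")
                .getOrCreate();

        // Small in-memory sample; in the question the data would instead
        // come from Parquet files, e.g. via spark.read().parquet(...).
        Dataset<Row> df = spark.sql(
                "SELECT 1 AS id, array('a', 'b') AS tags "
                + "UNION ALL SELECT 2 AS id, array('c') AS tags");

        Dataset<Row> exploded = explodeColumn(df, "tags");
        exploded.printSchema();
        exploded.show();

        spark.stop();
    }
}
```

Note that explode drops rows whose array is null or empty; if those rows need to be kept, functions.explode_outer (available in newer Spark versions) may be a better fit.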
