Write dataframe to csv with datatype map<string,bigint> in Spark

Views: 50 Published: 2021/11/14 23:12:01 Tags: apache-spark apache-spark-sql rdd

Problem description

I have a file, file1.snappy.parquet, with a complex data structure: it contains a map with arrays inside it. After processing it I got my final result, but while writing that result to CSV I get the following error:

"Exception in thread "main" java.lang.UnsupportedOperationException: CSV data source does not support map<string,bigint> data type."

The code I used:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.functions.{col, udf}

val conf = new SparkConf().setAppName("student-example").setMaster("local")
val sc = new SparkContext(conf)
val sqlcontext = new org.apache.spark.sql.SQLContext(sc)
val datadf = sqlcontext.read.parquet("C:\\file1.snappy.parquet")
// Sum the values stored under the "aggr" key, or return 0 when the key is absent
def sumaggr = udf((aggr: Map[String, collection.mutable.WrappedArray[Long]]) => if (aggr.keySet.contains("aggr")) aggr("aggr").sum else 0)
datadf.select(col("neid"), sumaggr(col("marks")).as("sum")).filter(col("sum") =!= 0).show(false)
datadf.write.format("com.databricks.spark.csv").option("header", "true").save("C:\\myfile.csv")

I tried converting with datadf.toString(), but I still face the same issue. How can I write that result to CSV?

Spark version 2.1.1

Recommended answer

The Spark CSV source supports only atomic types. You cannot store any columns that are non-atomic.
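
To see in advance which columns would trigger this error, one option is to inspect the schema before writing. The sketch below is an illustration and not part of the original answer: nonAtomicColumns is a hypothetical helper name, and it assumes that map, array, and struct columns are the ones the CSV writer rejects.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{ArrayType, MapType, StructType}

// Hypothetical helper: names of the columns whose type is a map, array, or struct.
// Assumption: these are the non-atomic types the CSV source refuses to write.
def nonAtomicColumns(df: DataFrame): Seq[String] =
  df.schema.fields.collect {
    case f if f.dataType.isInstanceOf[MapType] ||
              f.dataType.isInstanceOf[ArrayType] ||
              f.dataType.isInstanceOf[StructType] => f.name
  }.toSeq
```

Calling nonAtomicColumns(datadf) before the write would list the columns that need to be converted or dropped first.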

I think the best option is to convert the column whose datatype is map<string,bigint> to a JSON string and save that in the CSV, as below.

import spark.implicits._
import org.apache.spark.sql.functions._

// Serialize the map column to a JSON string so the CSV writer can store it
datadf
  .withColumn("column_name_with_map_type", to_json(struct($"column_name_with_map_type")))
  .write.csv("outputpath")
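
When the name of the map column is not known up front, the same idea can be applied to every map column in the schema. This is only a sketch under assumptions: mapsToJson is a hypothetical helper, it handles top-level map columns only, and it assumes column names without dots or other characters that would need escaping.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, struct, to_json}
import org.apache.spark.sql.types.MapType

// Hypothetical helper: replace each top-level map column with a JSON string.
// The map is wrapped in a struct first, as in the answer above, so the JSON
// object is keyed by the column name, e.g. {"counts":{"a":1}}.
def mapsToJson(df: DataFrame): DataFrame =
  df.schema.fields.foldLeft(df) { (acc, field) =>
    field.dataType match {
      case _: MapType =>
        acc.withColumn(field.name, to_json(struct(col(field.name))))
      case _ => acc // leave every other column unchanged
    }
  }
```

It could then be used as mapsToJson(datadf).write.option("header", "true").csv("outputpath"). Note also that the snippet in the question writes datadf itself, not the filtered selection of neid and sum. If only those two columns are needed, keeping the selection in a val and writing that would avoid the map column entirely, assuming neid is an atomic column.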

Hope this helps!
