Spark DataFrame write to JDBC - Can't get JDBC type for array<array<int>>
Question
I'm trying to save a DataFrame via JDBC (to Postgres). One of the fields is of type Array[Array[Int]]. Without any casting, it fails with:
Exception in thread "main" java.lang.IllegalArgumentException: Can't
get JDBC type for array<array<int>>
at ... (JdbcUtils.scala:148)
I added an explicit cast to the array datatype to guide the conversion:
val edgesDF = readings
.map { case ((a, b), (_, d, e, arrayArrayInt)) => (a, b, d, e, arrayArrayInt) }
.toDF("A", "B", "D", "E", "arrays")
edgesDF
.withColumn("arrays_", edgesDF.col("arrays").cast(ArrayType(ArrayType(IntegerType))))
.drop("arrays")
.withColumnRenamed("arrays_", "arrays")
.write
.mode(SaveMode.ErrorIfExists)
.jdbc(url = dbURLWithSchema, table = "mytable", connectionProperties = dbProps)
But it still fails with the same exception.
How can I get this data to persist to the DB?
Answer
You can't store array<array<int>> in the database as-is; the JDBC writer doesn't support a nested array datatype.
One option is to concatenate the nested arrays into a single delimited string using a simple udf, as below:
import org.apache.spark.sql.functions._
val arrToString = udf((value: Seq[Seq[Int]]) => {
  value.map(x => x.map(_.toString).mkString(",")).mkString("::")
})

// this udf converts the array<array<int>> column to a string like 1,2,3::3,4,5::6,7
df.withColumn("arrays", arrToString($"arrays"))
Hope this helps!