In a Scala notebook on Apache Spark Databricks, how do you correctly cast an array to type decimal(30,0)?
Question
I am trying to cast an array as Decimal(30,0) for use dynamically in a select, as:
WHERE array_contains(myArrayUDF(), someTable.someColumn)
However, when using:
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DecimalType

val arrIds = someData.select("id").withColumn("id", col("id")
  .cast(DecimalType(30, 0))).collect().map(_.getDecimal(0))
Databricks accepts this, but the resulting signature already looks wrong: intArrSurrIds: Array[java.math.BigDecimal] = Array(2181890000000, ...) // i.e. a BigDecimal
which results in the following error:
Error in SQL statement: AnalysisException: cannot resolve .. due to data type mismatch: Input to function array_contains should have been array followed by a value with same element type, but it's [array<decimal(38,18)>, decimal(30,0)]
How do you correctly cast to decimal(30,0) in a Spark Databricks Scala notebook instead of decimal(38,18)?
Any help is appreciated!
Answer
You can make arrIds an Array[Decimal] using the code below:
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{Decimal, DecimalType}

val arrIds = someData.select("id")
  .withColumn("id", col("id").cast(DecimalType(30, 0)))  // cast the column to decimal(30,0)
  .collect()
  .map(row => Decimal(row.getDecimal(0), 30, 0))  // wrap each value as a Spark Decimal with explicit precision and scale
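For illustration, the UDF from the question could then be registered along these lines (a sketch, not part of the original answer; myArrayUDF is the name used in the question, and the zero-argument function simply returns the collected array):

// Hypothetical registration of the question's UDF: a zero-argument
// function that returns the collected Array[Decimal].
spark.udf.register("myArrayUDF", () => arrIds)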
However, this will not solve your problem, because the precision and scale are lost once you create your user-defined function, as I explain in this answer.
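You can see the lost precision by inspecting the schema of the UDF's result (assuming the hypothetical registration sketched above); Spark infers its system-default decimal(38,18) for the element type, not decimal(30,0):

// Expected output (Spark infers the system-default decimal for UDF return types):
// root
//  |-- ids: array (nullable = true)
//  |    |-- element: decimal(38,18) (containsNull = true)
spark.sql("SELECT myArrayUDF() AS ids").printSchema()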
To solve your problem, you need to cast the column someTable.someColumn to Decimal with the same precision and scale as the UDF's return type. So your WHERE clause should be:
WHERE array_contains(myArrayUDF(), cast(someTable.someColumn as Decimal(38, 18)))
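Putting it together in a full query (a sketch reusing the hypothetical registration above; someTable and someColumn are the placeholders from the question):

// Both sides of array_contains are now decimal(38,18), so the element types match.
val matches = spark.sql("""
  SELECT *
  FROM someTable
  WHERE array_contains(myArrayUDF(), cast(someColumn AS Decimal(38, 18)))
""")
matches.show()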