In a Scala notebook on Apache Spark Databricks, how do you correctly cast an array to type decimal(30,0)?

Question

I am trying to cast an array to Decimal(30,0) for dynamic use in a select, as:

WHERE array_contains(myArrayUDF(), someTable.someColumn)

However, when casting with:

val arrIds = someData.select("id").withColumn("id", col("id")
                .cast(DecimalType(30, 0))).collect().map(_.getDecimal(0))

Databricks accepts that, but the resulting signature already looks wrong: intArrSurrIds: Array[java.math.BigDecimal] = Array(2181890000000, ...) // i.e., a BigDecimal

This leads to the following error:

Error in SQL statement: AnalysisException: cannot resolve.. due to data type mismatch: Input to function array_contains should have been array followed by a value with same element type, but it's [array<decimal(38,18)>, decimal(30,0)]

How do you correctly cast to decimal(30,0) in a Spark Databricks Scala notebook, instead of getting decimal(38,18)?

Any help is appreciated!

Answer

You can make arrIds an Array[Decimal] using the code below:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{Decimal, DecimalType}

val arrIds = someData.select("id")
  .withColumn("id", col("id").cast(DecimalType(30, 0)))
  .collect()
  // wrap each java.math.BigDecimal in Spark's Decimal, keeping precision 30 and scale 0
  .map(row => Decimal(row.getDecimal(0), 30, 0))
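
As a quick sanity check (a hypothetical addition, not part of the original answer), each element now reports the expected precision and scale:

// Hypothetical check: Spark's Decimal exposes its precision and scale
arrIds.headOption.foreach(d => println(s"precision=${d.precision}, scale=${d.scale}"))
// prints: precision=30, scale=0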

However, it will not solve your problem, because you lose the precision and scale as soon as the values pass through your user-defined function, as I explain in this answer.
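
Here is a minimal sketch of why the precision is lost (assuming a SparkSession named spark and the arrIds array from above; the UDF itself is illustrative): Spark infers a UDF's return type from the Scala type alone, and Decimal maps to the system default decimal(38,18), regardless of the precision of the values inside:

import org.apache.spark.sql.functions.udf

// Illustrative UDF: the values carry precision 30 / scale 0, but Spark only
// sees the Scala type Array[Decimal] and infers decimal(38,18) elements
val myArrayUDF = udf(() => arrIds)

spark.range(1).select(myArrayUDF().as("ids")).printSchema()
// root
//  |-- ids: array (nullable = true)
//  |    |-- element: decimal(38,18) (containsNull = true)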

To solve your problem, you need to cast the column someTable.someColumn to a Decimal with the same precision and scale as the UDF's return type. So your WHERE clause should be:

WHERE array_contains(myArray(), cast(someTable.someColumn as Decimal(38, 18)))
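For completeness, a hypothetical end-to-end wiring (the names myArray and someTable come from the question; the registration step and temp view are assumptions):

// Hypothetical: register the UDF, then run the corrected query
spark.udf.register("myArray", () => arrIds)

val result = spark.sql("""
  SELECT *
  FROM someTable
  WHERE array_contains(myArray(), CAST(someTable.someColumn AS DECIMAL(38, 18)))
""")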
