Unable to generate UUIDs in Spark SQL
Problem Description
Below is the code block and the error received.

Creating the temporary views:
sqlcontext.sql("""CREATE TEMPORARY VIEW temp_pay_txn_stage
USING org.apache.spark.sql.cassandra
OPTIONS (
table "t_pay_txn_stage",
keyspace "ks_pay",
cluster "Test Cluster",
pushdown "true"
)""".stripMargin)
sqlcontext.sql("""CREATE TEMPORARY VIEW temp_pay_txn_source
USING org.apache.spark.sql.cassandra
OPTIONS (
table "t_pay_txn_source",
keyspace "ks_pay",
cluster "Test Cluster",
pushdown "true"
)""".stripMargin)
Querying the views as below to get the new records from stage that are not present in source:
Scala> val df_newrecords = sqlcontext.sql("""Select UUID(),
| |stage.order_id,
| |stage.order_description,
| |stage.transaction_id,
| |stage.pre_transaction_freeze_balance,
| |stage.post_transaction_freeze_balance,
| |toTimestamp(now()),
| |NULL,
| |1 from temp_pay_txn_stage stage left join temp_pay_txn_source source on stage.order_id=source.order_id and stage.transaction_id=source.transaction_id where
| |source.order_id is null and source.transaction_id is null""")
org.apache.spark.sql.AnalysisException: Undefined function: 'uuid()'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 7
I am trying to generate UUIDs, but I am getting this error.
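As a side note (not part of the original answer): Spark 2.3.0 and later ship a built-in `uuid()` SQL function, so on a recent enough Spark version the original query works as written; the `Undefined function: 'uuid()'` error indicates an older release. A minimal sketch, assuming a running `spark-shell` session where `spark` is the provided `SparkSession`:

```scala
// Assumes Spark 2.3.0+, where uuid() is a built-in SQL function
// that returns a random (version 4) UUID string per row.
val df = spark.sql("SELECT uuid() AS id, 1 AS data")
df.show(false)
```

On older Spark versions, registering a UDF as shown in the solution below remains the workaround.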
Solution
Here is a simple example of how you can generate a timeuuid:
import org.apache.spark.sql.SQLContext
val sqlcontext = new SQLContext(sc)
import sqlcontext.implicits._
// Import UUIDs, which contains the timeBased() method
import com.datastax.driver.core.utils.UUIDs
// User-defined function timeUUID, which will return a time-based uuid
val timeUUID = udf(() => UUIDs.timeBased().toString)
// Sample query to test; change it to your own
val df_newrecords = sqlcontext.sql("SELECT 1 as data UNION SELECT 2 as data").withColumn("time_uuid", timeUUID())
// Print all the rows
df_newrecords.collect().foreach(println)
Output:
[1,9a81b3c0-170b-11e7-98bf-9bb55f3128dd]
[2,9a831350-170b-11e7-98bf-9bb55f3128dd]
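If the target column is a plain Cassandra `uuid` (not `timeuuid`), a dependency-free alternative is a UDF over `java.util.UUID.randomUUID()`, which needs no Cassandra driver on the classpath. This is a sketch, not part of the original answer; note that Cassandra's `timeuuid` type requires version 1 (time-based) UUIDs, so `randomUUID()` (version 4) is only suitable for `uuid` columns:

```scala
// Random (version 4) UUIDs from the JDK, no external dependency.
// Not valid for Cassandra timeuuid columns, which need version 1 UUIDs.
import java.util.UUID
import org.apache.spark.sql.functions.udf

val randomUUID = udf(() => UUID.randomUUID().toString)
val df = sqlcontext.sql("SELECT 1 as data UNION SELECT 2 as data")
  .withColumn("uuid", randomUUID())
df.collect().foreach(println)
```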
Sources:
https://stackoverflow.com/a/37232099/2320144
https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/utils/UUIDs.html#timeBased--