How to do higher order function transform with sub query and a map lookup?


Problem description


This is a follow-up question to my previous question.

scala> val map1 = spark.sql("select map('s1', 'p1', 's2', 'p2', 's3', 'p3') as lookup")

map1: org.apache.spark.sql.DataFrame = [lookup: map<string,string>]

scala> val ds1 = spark.sql("select 'p1' as p, Array('s2','s3') as c")

ds1: org.apache.spark.sql.DataFrame = [p: string, c: array&lt;string&gt;]

scala>  ds1.createOrReplaceTempView("ds1")

scala> map1.createOrReplaceTempView("map1")

scala> map1.show()
+--------------------+
|              lookup|
+--------------------+
|[p1 -> s1, p2 -> ...|
+--------------------+


scala> ds1.show()
+---+--------+
|  p|       c|
+---+--------+
| p1|[s2, s3]|
+---+--------+

scala> map1.selectExpr("element_at(`lookup`, 's2')").first()

res50: org.apache.spark.sql.Row = [p2]

scala> spark.sql("select element_at(`lookup`, 's1') from map1").show()
+----------------------+
|element_at(lookup, s1)|
+----------------------+
|                    p1|
+----------------------+
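Conceptually, `element_at` on this one-row map column is just a dictionary lookup. A minimal Python sketch of the two probes above, where the dict literal mirrors the map built in `map1` (this models the semantics only, not Spark itself):

```python
# Plain-Python model of map1's single map column:
# map('s1', 'p1', 's2', 'p2', 's3', 'p3')
lookup = {"s1": "p1", "s2": "p2", "s3": "p3"}

# element_at(`lookup`, 's2') returns the value stored under key 's2'
print(lookup["s2"])  # p2

# element_at(`lookup`, 's1') returns the value stored under key 's1'
print(lookup["s1"])  # p1
```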

So far so good. In my next two steps I am hitting some issues:

scala> ds1.selectExpr("p", "c", "transform(c, cs -> map1.selectExpr('element_at(`lookup`, cs)')) as cs").show()

20/09/28 19:44:59 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
20/09/28 19:44:59 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
20/09/28 19:45:03 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
20/09/28 19:45:03 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore root@10.1.21.76
20/09/28 19:45:03 WARN ObjectStore: Failed to get database map1, returning NoSuchObjectException
org.apache.spark.sql.AnalysisException: Undefined function: 'selectExpr'. This function is neither a registered temporary function nor a permanent function registered in the database 'map1'.; line 1 pos 19

scala> spark.sql("""select p, c, transform(c, cs -> (select element_at(`lookup`, cs) from map1)) cc from ds1""").show()

org.apache.spark.sql.AnalysisException: cannot resolve 'cs' given input columns: [map1.lookup]; line 1 pos 61;
'Project [p#329, c#330, transform(c#330, lambdafunction(scalar-subquery#713 [], lambda cs#715, false)) AS cc#714]
:  +- 'Project [unresolvedalias('element_at(lookup#327, 'cs), None)]
:     +- SubqueryAlias map1
:        +- Project [map(s1, p1, s2, p2, s3, p3) AS lookup#327]
:           +- OneRowRelation
+- SubqueryAlias ds1
   +- Project [p1 AS p#329, array(s2, s3) AS c#330]
      +- OneRowRelation

How can I solve these issues?

Solution

Simply add the map1 table to the from clause. The cross join with the single-row map1 brings the lookup column into scope, so the transform lambda can call element_at on it directly instead of referencing it through a subquery.

spark.sql("""select p, c, transform(c, cs -> element_at(`lookup`, cs)) cc from ds1 a, map1 b""").show()

+---+--------+--------+
|  p|       c|      cc|
+---+--------+--------+
| p1|[s2, s3]|[p2, p3]|
+---+--------+--------+
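A hedged Python sketch of what the fixed query computes, assuming the single-row tables from the question (the cross join simply pairs ds1's row with map1's `lookup`, and `transform` applies the map lookup to each array element):

```python
# The two single-row tables from the question, modeled as plain Python data.
lookup = {"s1": "p1", "s2": "p2", "s3": "p3"}  # map1.lookup
row = {"p": "p1", "c": ["s2", "s3"]}           # ds1's only row

# transform(c, cs -> element_at(`lookup`, cs)):
# look up each element of the array column c in the map.
cc = [lookup[cs] for cs in row["c"]]
print(cc)  # ['p2', 'p3']
```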
