运行相关子查询时引发 sql 错误 [英] spark sql error when running correlated subquery

查看:30
本文介绍了运行相关子查询时引发 sql 错误的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在 Intellij Maven IDE 中的 spark SQL 查询下运行,

I'm running below spark SQL query in Intellij Maven IDE,

SELECT seq_no, 
       amount, 
       (select max(b.amount) 
        from premium b 
        where b.seq_no < a.seq_no) last_high_prem
FROM premium a

出现以下错误,

线程main"org.apache.spark.sql.AnalysisException 中的异常:相关标量子查询只能包含相等谓词:(seq_no#11#32 < seq_no#11);

Exception in thread "main" org.apache.spark.sql.AnalysisException: The correlated scalar subquery can only contain equality predicates: (seq_no#11#32 < seq_no#11);

当使用相等运算符时,我理解相关查询,然后目前仅适用于 spark SQL.有什么方法可以解决这个问题.

I understand correlated query when uses equality operator then only works currently in spark SQL. Is there any method to overcome this issue.

我知道我们可以做到 hiveql.但是需要在我的本地机器上设置 hadoop 和 hive.请告诉我如何缓解此问题.

I know we can do it hiveql. But need to setup hadoop and hive in my local machine. Please let me know how to mitigate the issue.

推荐答案

我对 Spark SQL 几乎一无所知,但在我看来,您的问题与相关子查询有关,在大多数情况下,此查询不需要SQL的品牌.Spark 确实接受 max 函数作为 窗口函数.

I know next to nothing about Spark SQL, but it seems to me your issue is with the correlated subquery, which wouldn't be necessary for this query in most brands of SQL. Spark does accept the max function as a Window Function.

你能做到吗:

SELECT seq_no, 
       amount, 
       max(amount) OVER (ORDER BY seq_no ROWS BETWEEN UNBOUNDED PRECEDING and 1 PRECEDING) AS last_high_prem
FROM premium 

注意:您可能还需要一个 partition by 短语,但不是针对您提供的确切查询.

Note: you probably also need a partition by phrase, but not for the exact query you've presented.

这篇关于运行相关子查询时引发 sql 错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆