运行相关子查询时引发 sql 错误 [英] spark sql error when running correlated subquery
问题描述
我在 Intellij Maven IDE 中的 spark SQL 查询下运行,
I'm running below spark SQL query in Intellij Maven IDE,
SELECT seq_no,
amount,
(select max(b.amount)
from premium b
where b.seq_no < a.seq_no) last_high_prem
FROM premium a
出现以下错误,
线程main"org.apache.spark.sql.AnalysisException 中的异常:相关标量子查询只能包含相等谓词:(seq_no#11#32 < seq_no#11);
Exception in thread "main" org.apache.spark.sql.AnalysisException: The correlated scalar subquery can only contain equality predicates: (seq_no#11#32 < seq_no#11);
当使用相等运算符时,我理解相关查询,然后目前仅适用于 spark SQL.有什么方法可以解决这个问题.
I understand correlated query when uses equality operator then only works currently in spark SQL. Is there any method to overcome this issue.
我知道我们可以做到 hiveql.但是需要在我的本地机器上设置 hadoop 和 hive.请告诉我如何缓解此问题.
I know we can do it hiveql. But need to setup hadoop and hive in my local machine. Please let me know how to mitigate the issue.
推荐答案
我对 Spark SQL 几乎一无所知,但在我看来,您的问题与相关子查询有关,在大多数情况下,此查询不需要SQL的品牌.Spark 确实接受 max
函数作为 窗口函数.
I know next to nothing about Spark SQL, but it seems to me your issue is with the correlated subquery, which wouldn't be necessary for this query in most brands of SQL. Spark does accept the max
function as a Window Function.
你能做到吗:
SELECT seq_no,
amount,
max(amount) OVER (ORDER BY seq_no ROWS BETWEEN UNBOUNDED PRECEDING and 1 PRECEDING) AS last_high_prem
FROM premium
注意:您可能还需要一个 partition by
短语,但不是针对您提供的确切查询.
Note: you probably also need a partition by
phrase, but not for the exact query you've presented.
这篇关于运行相关子查询时引发 sql 错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!