Does Spark support subqueries?
When I run this query, I get the following error:
select * from raw_2 where ip NOT IN (select * from raw_1);
org.apache.spark.sql.AnalysisException:
Unsupported language features in query:
select * from raw_2 where ip NOT IN (select * from raw_1)
TOK_QUERY 1, 0,24, 14
TOK_FROM 1, 4,6, 14
TOK_TABREF 1, 6,6, 14
TOK_TABNAME 1, 6,6, 14
raw_2 1, 6,6, 14
TOK_INSERT 0, -1,24, 0
TOK_DESTINATION 0, -1,-1, 0
TOK_DIR 0, -1,-1, 0
TOK_TMP_FILE 0, -1,-1, 0
TOK_SELECT 0, 0,2, 0
TOK_SELEXPR 0, 2,2, 0
TOK_ALLCOLREF 0, 2,2, 0
TOK_WHERE 1, 8,24, 29
NOT 1, 10,24, 29
TOK_SUBQUERY_EXPR 1, 14,10, 33
TOK_SUBQUERY_OP 1, 14,14, 33
IN 1, 14,14, 33
TOK_QUERY 1, 16,24, 51
TOK_FROM 1, 21,23, 51
TOK_TABREF 1, 23,23, 51
TOK_TABNAME 1, 23,23, 51
raw_1 1, 23,23, 51
TOK_INSERT 0, -1,19, 0
TOK_DESTINATION 0, -1,-1, 0
TOK_DIR 0, -1,-1, 0
TOK_TMP_FILE 0, -1,-1, 0
TOK_SELECT 0, 17,19, 0
TOK_SELEXPR 0, 19,19, 0
TOK_ALLCOLREF 0, 19,19, 0
TOK_TABLE_OR_COL 1, 10,10, 26
ip 1, 10,10, 26
scala.NotImplementedError: No parse rules for ASTNode type: 817, text:
TOK_SUBQUERY_EXPR :
TOK_SUBQUERY_EXPR 1, 14,10, 33
TOK_SUBQUERY_OP 1, 14,14, 33
IN 1, 14,14, 33
TOK_QUERY 1, 16,24, 51
TOK_FROM 1, 21,23, 51
TOK_
Spark 2.0.0+:
Since 2.0.0, Spark supports a full range of subqueries. See "Does SparkSQL support subquery?" for details.
Spark < 2.0.0:
Does Spark support subqueries?
Generally speaking, it does. Constructs like SELECT * FROM (SELECT * FROM foo WHERE bar = 1) AS tmp
are perfectly valid queries in Spark SQL.
As far as I can tell from the Catalyst parser source, it doesn't support inner queries in a NOT IN
clause; the rule only accepts a list of expressions:
| termExpression ~ (NOT ~ IN ~ "(" ~> rep1sep(termExpression, ",")) <~ ")" ^^ {
case e1 ~ e2 => Not(In(e1, e2))
}
It is still possible to use an outer join followed by a filter to obtain the same effect.
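A minimal sketch of the outer-join-plus-filter rewrite follows. It uses Python's built-in sqlite3 as a stand-in so the example runs without a cluster; it is not Spark itself. It also assumes that raw_1 has a single column named ip (the question does not show the schema), and the sample addresses and the helper name ips_not_in_raw_1 are made up for illustration. The same SQL shape should be accepted by Spark SQL before 2.0.0, but that has not been checked here.

```python
import sqlite3


def ips_not_in_raw_1(conn):
    """Return rows of raw_2 whose ip has no match in raw_1.

    Left outer join keeps every raw_2 row; unmatched rows carry NULL in
    the raw_1 columns, so filtering on raw_1.ip IS NULL keeps only them.
    """
    query = """
        SELECT raw_2.*
        FROM raw_2
        LEFT OUTER JOIN raw_1 ON raw_2.ip = raw_1.ip
        WHERE raw_1.ip IS NULL
    """
    return conn.execute(query).fetchall()


# Hypothetical schema and data: the column name `ip` in raw_1 is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_1 (ip TEXT)")
conn.execute("CREATE TABLE raw_2 (ip TEXT)")
conn.executemany("INSERT INTO raw_1 VALUES (?)",
                 [("10.0.0.1",), ("10.0.0.2",)])
conn.executemany("INSERT INTO raw_2 VALUES (?)",
                 [("10.0.0.2",), ("10.0.0.3",), ("10.0.0.4",)])

result = sorted(ips_not_in_raw_1(conn))

# For comparison only: the original NOT IN form, which SQLite does accept.
not_in_result = sorted(conn.execute(
    "SELECT * FROM raw_2 WHERE ip NOT IN (SELECT ip FROM raw_1)").fetchall())

print(result)
```

One difference worth keeping in mind: the two forms treat NULL differently. NOT IN returns no rows at all when the subquery yields a NULL, while the join-and-filter version simply ignores NULLs in raw_1 and keeps raw_2 rows whose ip is NULL.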