How to use Spark SQL to run a recursive query
Problem description
I'm trying to use Spark SQL to recursively query a hierarchical dataset and identify the root parent of all the nested children.
I've tried a self-join, but it only works for one level.
Any ideas or suggestions?
Thanks
Recommended answer
You can use a GraphX-based solution to perform a recursive query (parent/child or hierarchical queries). Many databases provide this functionality as recursive common table expressions (CTEs) or the CONNECT BY SQL clause.
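For readers unfamiliar with recursive CTEs, here is a minimal sketch of what such a query looks like in a database that supports them. It uses SQLite through Python's standard `sqlite3` module purely for illustration; it is not Spark code, and the `nodes` table, its `id` and `parent_id` columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample hierarchy: (id, parent_id); parent_id is NULL for a root.
rows = [(1, None), (2, 1), (3, 2), (4, 3), (5, None), (6, 5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany("INSERT INTO nodes VALUES (?, ?)", rows)

query = """
WITH RECURSIVE tree(id, root_id) AS (
    -- Anchor member: every root is its own root.
    SELECT id, id FROM nodes WHERE parent_id IS NULL
    UNION ALL
    -- Recursive member: a child inherits the root of its parent.
    SELECT n.id, t.root_id
    FROM nodes AS n
    JOIN tree AS t ON n.parent_id = t.id
)
SELECT id, root_id FROM tree ORDER BY id
"""

# Map each node id to the id of its top-level ancestor.
root_of = dict(conn.execute(query).fetchall())
conn.close()

print(root_of)  # {1: 1, 2: 1, 3: 1, 4: 1, 5: 5, 6: 5}
```

The recursive member is the part a single self-join cannot express: it keeps joining children onto rows produced by the previous step until no new rows appear.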
See this article for more information: https://www.qubole.com/blog/processing-hierarchical-data-using-spark-graphx-pregel-api/