How to use Spark SQL to do a recursive query
Problem description
I'm trying to use Spark SQL to recursively query a hierarchical dataset and identify the root parent of all the nested children.
I've tried using a self-join, but it only works for one level.
Any ideas or suggestions?
Thanks
Recommended answer
You can use a GraphX-based solution to perform recursive queries (parent/child or hierarchical queries). Many databases provide this functionality as recursive common table expressions (CTEs) or as the CONNECT BY SQL clause.
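For reference, here is roughly what a recursive CTE looks like in a database that supports one. This is a minimal sketch using SQLite through Python's standard sqlite3 module, not Spark SQL. The table name nodes, its columns id and parent_id, and the sample rows are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (id, parent_id); parent_id is NULL for a root.
rows = [(1, None), (2, 1), (3, 2), (4, 2), (5, None), (6, 5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany("INSERT INTO nodes VALUES (?, ?)", rows)

# Anchor member: every root is its own root, at depth 0.
# Recursive member: a child inherits the root of its parent, one level deeper.
query = """
WITH RECURSIVE tree(id, root_id, depth) AS (
    SELECT id, id, 0 FROM nodes WHERE parent_id IS NULL
    UNION ALL
    SELECT n.id, t.root_id, t.depth + 1
    FROM nodes AS n
    JOIN tree AS t ON n.parent_id = t.id
)
SELECT id, root_id, depth FROM tree ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The recursive member is the same self-join from the question, except that the database keeps re-running it until it produces no new rows.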
See this article for more information: https://www.qubole.com/blog/processing-hierarchical-data-using-spark-graphx-pregel-api/
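The core iteration can be sketched without Spark: move every node's pointer one level up per round, and stop when a round changes nothing. The plain-Python sketch below is not the GraphX or Pregel API and is not taken from the linked article. The function name find_roots and the sample data are hypothetical, and it assumes the parent links contain no cycles.

```python
def find_roots(parent_of):
    """Map each node to its root ancestor.

    parent_of maps node -> parent, with None marking a root.
    Returns (roots, rounds), where rounds is the number of rounds
    that changed at least one pointer.
    """
    # Start with every node pointing at itself.
    current = {node: node for node in parent_of}
    # A tree with n nodes is fewer than n levels deep, so n + 1 rounds
    # are enough unless the parent links contain a cycle.
    for rounds in range(len(parent_of) + 1):
        updated = {}
        for node, ancestor in current.items():
            parent = parent_of.get(ancestor)
            # Equivalent of one self-join: step up one level if possible.
            updated[node] = ancestor if parent is None else parent
        if updated == current:
            return current, rounds
        current = updated
    raise ValueError("cycle detected in parent links")


# Hypothetical sample hierarchy: 1 and 5 are roots.
parent_of = {1: None, 2: 1, 3: 2, 4: 2, 5: None, 6: 5}
roots, rounds = find_roots(parent_of)
print(roots, rounds)
```

Each round corresponds to one more self-join level, which is why a single self-join only resolves one level. In Spark the same loop would run over DataFrames or a graph instead of a dictionary.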