Whole-Stage Code Generation in Spark 2.0

Problem description

I read about Whole-Stage Code Generation as a way to optimize SQL queries, through p539-neumann.pdf and sparksql-sql-codegen-is-not-giving-any-improvemnt.

Unfortunately, no one answered the question above.

I'm curious which scenarios this Spark 2.0 feature applies to, but I couldn't find a proper use case after googling.

Can this feature be used whenever we use SQL? If so, is there a proper use case that shows it working?

Solution

When you are using Spark 2.0, code generation is enabled by default, so most DataFrame and SQL queries take advantage of the performance improvements without any extra work on your part. There are some potential exceptions, such as queries that use Python UDFs, which may slow things down.
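One way to see this for yourself is to look at the physical plan. The sketch below assumes PySpark is installed and a local session can be started; the helper names (`physical_plan_text`, `codegen_operators`, `demo`) are made up for illustration. It relies on two things Spark 2.x provides: the `spark.sql.codegen.wholeStage` setting, and the `*` prefix that `explain()` puts in front of operators that belong to a generated stage. Other versions may format the plan differently (newer releases print `*(1)`, for example), so treat the parsing as a rough heuristic.

```python
import contextlib
import io


def physical_plan_text(df):
    """Capture what df.explain() prints (the physical plan) as a string."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        df.explain()
    return buf.getvalue()


def codegen_operators(plan_text):
    """Return the plan operators marked with '*'.

    Assumption: Spark 2.x explain output prefixes operators that are part
    of a whole-stage-generated function with '*'.
    """
    marked = []
    for line in plan_text.splitlines():
        op = line.lstrip(" :+-")  # drop the tree-drawing characters
        if op.startswith("*"):
            marked.append(op)
    return marked


def demo():
    """Compare the plan of one query with codegen on and off."""
    # Imported here so the helpers above work without a Spark install.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("wholestage-demo")  # hypothetical app name
             .getOrCreate())

    def build_query():
        # Rebuilt on each call so the plan reflects the current setting.
        return spark.range(1000).filter("id % 2 = 0").groupBy().sum("id")

    for enabled in ("true", "false"):
        spark.conf.set("spark.sql.codegen.wholeStage", enabled)
        plan = physical_plan_text(build_query())
        print("wholeStage =", enabled)
        print(plan)
        print("operators marked as generated:", len(codegen_operators(plan)))

    spark.stop()
```

Calling `demo()` on a machine with Spark should print the same query plan twice. With the setting left at its default, the filter and aggregate operators are expected to carry the `*` marker; with it switched off, they should not.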

Code generation is one of the primary components of the Spark SQL engine's Catalyst Optimizer. In brief, the Catalyst Optimizer does the following: (1) analysis of a logical plan to resolve references, (2) logical plan optimization, (3) physical planning, and (4) code generation.
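To give a feel for what step (4) buys you, here is a toy sketch in plain Python. It is not Spark's generator (Spark emits Java source and compiles it at runtime); the operator names and the `generate_fused` helper are invented for illustration. It contrasts a chain of per-operator iterators with a single loop produced from a source string, which is the basic idea behind collapsing a whole stage into one function.

```python
# Operator-at-a-time execution: each operator pulls rows from its child.
def scan(rows):
    for row in rows:
        yield row


def filter_op(child, predicate):
    for row in child:
        if predicate(row):
            yield row


def sum_op(child):
    total = 0
    for row in child:
        total += row
    return total


def interpreted_sum_even(rows):
    """SELECT sum(x) WHERE x % 2 = 0, evaluated through an operator chain."""
    return sum_op(filter_op(scan(rows), lambda x: x % 2 == 0))


def generate_fused(predicate_src):
    """Generate and compile one function that does scan, filter and sum
    in a single loop, with no per-row calls between operators.

    predicate_src is a Python expression over the variable `x`.
    """
    source = (
        "def fused(rows):\n"
        "    total = 0\n"
        "    for x in rows:\n"
        f"        if {predicate_src}:\n"
        "            total += x\n"
        "    return total\n"
    )
    namespace = {}
    exec(source, namespace)  # compile the generated source at runtime
    return namespace["fused"], source


fused_sum_even, generated_source = generate_fused("x % 2 == 0")

data = range(10)
# Both strategies must agree on the result.
assert interpreted_sum_even(data) == fused_sum_even(data)
print(generated_source)
```

The fused version does the same work as the operator chain but keeps everything inside one loop, which is roughly why generated code tends to help on CPU-bound queries.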

Great references for all of this are the blog posts

HTH!
