Hive execution hook

Views: 175 Published: 2018/5/31 18:51:03 Tags: hadoop hive bigdata cloudera


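Question

I need to hook a custom execution hook into Apache Hive. Please let me know if anybody knows how to do this.

The environment I am currently using is:

Hadoop: Cloudera version 4.1.2
Operating system: CentOS

Thanks,
Arun

Answer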
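There are several types of hooks, depending on the stage at which you want to inject your custom code:

Driver run hooks (pre/post)

Semantic analyzer hooks (pre/post)

Execution hooks (pre/failure/post)

Client statistics publisher

If you run a script, the processing flow looks like this:

1. Driver.run() takes the command.
2. HiveDriverRunHook.preDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
3. Driver.compile() starts processing the command: it creates the abstract syntax tree.
4. AbstractSemanticAnalyzerHook.preAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
5. Semantic analysis
6. AbstractSemanticAnalyzerHook.postAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
7. Create and validate the query plan (physical plan).
8. Driver.execute(): ready to run the jobs.
9. ExecuteWithHookContext.run() (HiveConf.ConfVars.PREEXECHOOKS)
10. ExecDriver.execute() runs all the jobs.
11. For each job, at every HiveConf.ConfVars.HIVECOUNTERSPULLINTERVAL interval, ClientStatsPublisher.run() is called to publish statistics (HiveConf.ConfVars.CLIENTSTATSPUBLISHERS). If a task fails: ExecuteWithHookContext.run() (HiveConf.ConfVars.ONFAILUREHOOKS)
12. Finish all the tasks.
13. ExecuteWithHookContext.run() (HiveConf.ConfVars.POSTEXECHOOKS)
14. Before returning the result: HiveDriverRunHook.postDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
15. Return the result.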
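To make the ordering concrete, the execution part of this flow (pre-execution hooks, the jobs, then either failure hooks or post-execution hooks) can be modelled in a few lines of plain Java. This is only a sketch: SketchHook and HookOrderSketch are hypothetical stand-ins, not Hive classes, and the choice to skip the post-execution hooks after a failure is my assumption about the Driver's behaviour rather than something stated above.

```java
import java.util.ArrayList;
import java.util.List;

class HookOrderSketch {
    // Stand-in for Hive's ExecuteWithHookContext; NOT the real interface.
    interface SketchHook {
        void run(List<String> trace) throws Exception;
    }

    // Models the execution stages: pre hooks, jobs, then failure or post hooks.
    static List<String> runQuery(boolean jobFails) throws Exception {
        List<String> trace = new ArrayList<>();
        SketchHook pre = t -> t.add("pre");
        SketchHook onFailure = t -> t.add("onFailure");
        SketchHook post = t -> t.add("post");

        pre.run(trace);            // hive.exec.pre.hooks stage
        trace.add("job");          // stands in for ExecDriver.execute()
        if (jobFails) {
            onFailure.run(trace);  // failure hooks stage
            return trace;          // assumption: post hooks are skipped on failure
        }
        post.run(trace);           // post-execution hooks stage
        return trace;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runQuery(false)); // prints [pre, job, post]
        System.out.println(runQuery(true));  // prints [pre, job, onFailure]
    }
}
```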
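For each of the hooks I have indicated the interface you have to implement. The parentheses give the corresponding configuration property key, which you have to set at the beginning of the script in order to register the class. For example, to set the pre-execution hook (the 9th stage of the workflow):
HiveConf.ConfVars.PREEXECHOOKS -> hive.exec.pre.hooks: set hive.exec.pre.hooks=com.example.MyPreHook;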
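As a rough illustration of what registering a class by name amounts to, the sketch below reads a comma-separated list of class names (the form hive.exec.pre.hooks takes), instantiates each class by reflection, and calls it. Everything here is a self-contained model: SketchHook stands in for Hive's ExecuteWithHookContext interface, MyPreHook is a hypothetical hook, and loadHooks is my approximation of the mechanism, not Hive's actual code. A real hook would implement org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext from the hive-exec library and would have to be on Hive's classpath.

```java
import java.util.ArrayList;
import java.util.List;

class HookRegistrationSketch {
    // Stand-in for org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext.
    interface SketchHook {
        void run(String queryText) throws Exception;
    }

    // Hypothetical pre-execution hook that just records the queries it sees.
    static class MyPreHook implements SketchHook {
        static final List<String> seen = new ArrayList<>();

        @Override
        public void run(String queryText) {
            seen.add(queryText);
        }
    }

    // Turns a comma-separated property value into hook instances via reflection.
    static List<SketchHook> loadHooks(String propertyValue) throws Exception {
        List<SketchHook> hooks = new ArrayList<>();
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return hooks; // nothing registered
        }
        for (String className : propertyValue.split(",")) {
            Class<?> cls = Class.forName(className.trim());
            hooks.add((SketchHook) cls.getDeclaredConstructor().newInstance());
        }
        return hooks;
    }

    public static void main(String[] args) throws Exception {
        // Plays the role of the value given to "set hive.exec.pre.hooks=...;".
        String value = MyPreHook.class.getName();
        for (SketchHook hook : loadHooks(value)) {
            hook.run("select 1");
        }
        System.out.println(MyPreHook.seen); // prints [select 1]
    }
}
```

The point of the model is that the hook class needs a no-argument constructor and must be loadable by name; a typo in the property value would surface as a ClassNotFoundException in this sketch.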
Unfortunately these features are not really documented, but you can always look into the Driver class to see the order in which the hooks are evaluated.

Remark: I assumed Hive 0.11.0 here; I don't think the Cloudera distribution differs (too much).
