Hive query execution for custom UDF is expecting HDFS jar path instead of local path in CDH4 with Oozie flow


Problem description


We are migrating from CDH3 to CDH4, and as part of this migration we are moving all the jobs that we have on CDH3. We have noticed one critical issue: a workflow is executed through Oozie to run a Python script, which internally invokes a Hive query (hive -e {query}). In this Hive query we add a custom jar using add jar {LOCAL PATH FOR JAR} and create a temporary function for the custom UDF. Everything looks fine up to this point. But when the query starts executing with the custom UDF function, it fails with a distributed cache File Not Found exception, because it is looking for the jar in the HDFS path instead of the local path.
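For context, the failing sequence looks roughly like this (the jar path, function name, class name, and table are all placeholders, not the actual ones from the job):

```sql
-- Illustrative sketch only: paths and names below are placeholders.
ADD JAR /local/path/to/custom.jar;
CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUDF';
-- This step fails under CDH4/Oozie: the distributed cache looks for
-- the jar under hdfs:// instead of the local filesystem path.
SELECT my_udf(col) FROM some_table;
```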


I am not sure if I am missing some configuration here.

Execution trace:


WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/yarn/yarn_20131107020505_79b41443-b9f4-4d36-a0eb-4f0d79cd3ce9.log
java.io.FileNotFoundException: File does not exist: hdfs://aa.bb.com:8020/opt/nfsmount/mypath/custom.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:824)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
    .....
    .....


Any help on this is highly appreciated.

Regards, GHK.

Recommended answer


There are a few options. All the required jars should be on the classpath before you run the Hive query.


Option 1: Add your custom jar via <file>/hdfs/path/to/your/jar</file> in the Oozie workflow.
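As a sketch, the `<file>` element goes inside the workflow action. The action name, script name, and jar path below are placeholders; adapt them to your workflow.xml:

```xml
<!-- Hypothetical shell action in workflow.xml; all paths and names
     are placeholders. Oozie localizes each <file> from HDFS into the
     action's working directory, making the jar available locally. -->
<action name="run-hive-script">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>run_query.py</exec>
        <file>run_query.py</file>
        <file>/hdfs/path/to/your/custom.jar#custom.jar</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>
```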


Option 2: Use the --auxpath /local/path/to/your/jar option when invoking Hive from your Python script. E.g.: hive --auxpath /local/path/to/your.jar -e {query}
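Since the question describes a Python script invoking Hive, option 2 might be wired up like this. The jar path and query are placeholders, and the actual hive call is commented out because it needs a working Hive installation:

```python
import subprocess

# Placeholder values -- substitute your own jar location and query.
jar_path = "/local/path/to/custom.jar"
query = "SELECT my_udf(col) FROM some_table"

# --auxpath puts the jar on Hive's classpath before the query runs,
# so the temporary UDF can be resolved without an HDFS lookup.
cmd = ["hive", "--auxpath", jar_path, "-e", query]

# Uncomment to actually run the query:
# exit_status = subprocess.call(cmd)
print(" ".join(cmd))
```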

