embedded hadoop-pig: what's the correct way to use the automatic addContainingJar for UDFs?
Problem description
when you use pigServer.registerFunction, you're not supposed to explicitly call pigServer.registerJar, but rather have pig automatically detect the jar using jarManager.findContainingJar.
However, we have a complex UDF whose class depends on classes from multiple other jars. So we created a jar-with-dependencies with the maven-assembly plugin. But this causes the entire jar to land in pigContext.skipJars (as it contains pig.jar itself), so it never gets sent to the hadoop server :(
What's the correct approach here? Must we manually call registerJar for every jar we depend on?
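The question mentions jarManager.findContainingJar; you can run that same lookup yourself to see which jar pig would auto-ship for your UDF. A minimal sketch, assuming pig is on the classpath; the FindJarCheck and MyUpper names are made up for illustration, and findContainingJar may return null for a class that wasn't loaded from a jar:

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.JarManager;

public class FindJarCheck {
    // stand-in UDF so the snippet is self-contained (hypothetical)
    public static class MyUpper extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().toUpperCase();
        }
    }

    public static void main(String[] args) {
        // the same lookup pig performs when you call registerFunction:
        // which jar was this class loaded from?
        String udfJar = JarManager.findContainingJar(MyUpper.class);
        String pigJar = JarManager.findContainingJar(PigServer.class);

        System.out.println("UDF jar: " + udfJar);
        System.out.println("pig jar: " + pigJar);
        // with a jar-with-dependencies both paths are identical, which is
        // exactly the situation where the jar lands in pigContext.skipJars
    }
}
```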
not sure what's the certified way, but here's some pointers:

- when you use pigServer.registerFunction, pig automatically detects the jar that contains the UDFs and sends it to the jobTracker
- pig also automatically detects the jar that contains the PigMapReduce class (JarManager.createJar), extracts from it only the classes that start with org/apache/pig, org/antlr/runtime, etc., and sends those to the jobTracker as well
- so, if your UDF sits in the same jar as PigMapReduce, you're screwed, because it won't get sent
- our conclusion: don't use jar-with-dependencies (a sketch of the manual alternative follows below)
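Not part of the original answer, but here's a minimal sketch of the manual route the question asks about: keep the UDF in its own thin jar and call registerJar once per dependency. The jar paths and the com.example.MyUpper class below are hypothetical placeholders:

```java
import org.apache.pig.ExecType;
import org.apache.pig.FuncSpec;
import org.apache.pig.PigServer;

public class RegisterUdfJars {
    public static void main(String[] args) throws Exception {
        PigServer pigServer = new PigServer(ExecType.MAPREDUCE);

        // register each dependency jar explicitly instead of bundling
        // everything into one jar-with-dependencies (paths are
        // hypothetical -- adjust to your build layout)
        pigServer.registerJar("lib/my-udfs.jar");          // thin jar: UDF classes only
        pigServer.registerJar("lib/commons-lang-2.6.jar"); // UDF dependency
        pigServer.registerJar("lib/guava-14.0.1.jar");     // UDF dependency

        // alias the UDF; pig still auto-detects the containing jar via
        // JarManager.findContainingJar, which now resolves to the thin jar
        pigServer.registerFunction("MY_UPPER",
                new FuncSpec("com.example.MyUpper"));

        pigServer.registerQuery("A = LOAD 'input.txt' AS (line:chararray);");
        pigServer.registerQuery("B = FOREACH A GENERATE MY_UPPER(line);");
        pigServer.store("B", "output");
    }
}
```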
HTH