Example and more explanation about LoadFunc


Question

Where can I find more information and examples about LoadFunc? Apart from http://web.archive.org/web/20130701024312/http://ofps.oreilly.com/titles/9781449302641/load_and_store_funcs.html I don't see any examples that use the new LoadFunc APIs. Can anyone please let me know where I can find an example of writing a load UDF?

Solution

As of Pig 0.7.0, loaders extend the LoadFunc abstract class. This means they need to override four methods:

• getInputFormat(): returns to the caller an instance of the InputFormat that the loader supports. The actual load process needs an instance to use at load time, and does not want to place any constraints on how that instance is created.

• prepareToRead(): called before a split is read. It passes in the reader used while reading the split, as well as the split itself. The loader implementation usually keeps the reader, and can access the split if needed.

• setLocation(): Pig calls this to communicate the load location to the loader, which is responsible for passing that information to the underlying InputFormat object. This method can be called multiple times, so it should not hold any state (unless that state is reset each time the method is called).

• getNext(): Pig calls this to get the next tuple from the loader once all setup has been done. If this method returns null, Pig assumes that all information in the split passed via prepareToRead() has been processed.
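The call order described above can be sketched in plain Java. This is only a model under stated assumptions, not Pig code: `SketchLoadFunc`, `SketchInputFormat`, `LineLoader`, and `LoaderLifecycleDemo` are hypothetical names, and simple JDK types stand in for the real ones (a `BufferedReader` for Hadoop's RecordReader, a `String` for the split, a `Map` for the job configuration, a `List<String>` for a Pig tuple) so that the example runs without Pig or Hadoop on the classpath. A real loader would extend `org.apache.pig.LoadFunc` and use the actual Hadoop and Pig types.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver playing the role of Pig. Not a Pig class.
class LoaderLifecycleDemo {
    static List<List<String>> readSplit(SketchLoadFunc loader, String location,
                                        String splitContents) throws IOException {
        Map<String, String> job = new HashMap<>();
        // setLocation() may be called several times, so call it twice here.
        loader.setLocation(location, job);
        loader.setLocation(location, job);
        // The input format creates the reader for the split.
        SketchInputFormat format = loader.getInputFormat();
        BufferedReader reader = format.createReader(splitContents);
        loader.prepareToRead(reader, splitContents);
        // Pull tuples until getNext() returns null, meaning the split is done.
        List<List<String>> tuples = new ArrayList<>();
        List<String> tuple;
        while ((tuple = loader.getNext()) != null) {
            tuples.add(tuple);
        }
        return tuples;
    }

    public static void main(String[] args) throws IOException {
        String split = "a\tb\nc\td\n"; // two tab-separated records
        System.out.println(readSplit(new LineLoader(), "/data/input.tsv", split));
        // prints [[a, b], [c, d]]
    }
}

// Stand-in for Hadoop's InputFormat (assumption: reduced to one method).
interface SketchInputFormat {
    BufferedReader createReader(String splitContents);
}

// Stand-in mirroring the four methods a LoadFunc subclass overrides.
abstract class SketchLoadFunc {
    abstract SketchInputFormat getInputFormat() throws IOException;
    abstract void prepareToRead(BufferedReader reader, String split) throws IOException;
    abstract void setLocation(String location, Map<String, String> job) throws IOException;
    abstract List<String> getNext() throws IOException;
}

// Hypothetical loader that turns each tab-separated line into one tuple.
class LineLoader extends SketchLoadFunc {
    private BufferedReader reader; // kept from prepareToRead()

    @Override
    SketchInputFormat getInputFormat() {
        return contents -> new BufferedReader(new StringReader(contents));
    }

    @Override
    void prepareToRead(BufferedReader reader, String split) {
        this.reader = reader; // the split itself is not needed in this sketch
    }

    @Override
    void setLocation(String location, Map<String, String> job) {
        // No loader state is kept; the location only goes into the job config,
        // so repeated calls are harmless. "input.path" is a made-up key.
        job.put("input.path", location);
    }

    @Override
    List<String> getNext() throws IOException {
        String line = reader.readLine();
        if (line == null) {
            return null; // end of split
        }
        return Arrays.asList(line.split("\t", -1));
    }
}
```

In this sketch `setLocation()` writes only to the job map and keeps nothing on the loader, which is one way to satisfy the "can be called multiple times" constraint; the reader is the only per-split state, and it is replaced on every `prepareToRead()` call.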

Here are a few nice articles on writing a custom load function for Pig:
