Pig 3rd party UDF clarification


Question

I am new to Pig. From the Pig wiki page I learned that there is the piggybank UDF collection and another useful collection, DataFu, from LinkedIn. I also learned that, as of Pig 0.8, piggybank is part of Apache Pig's built-in UDFs.

But I think most of the piggybank UDFs, such as StringConcat, are not documented in Apache Pig.

I am looking for date formatting UDFs that convert a datetime to a string, like FormatDate. I am not sure whether pig/piggybank already has such UDFs, as I could not find them in the documentation.

Also, are there any other third-party UDFs (Java/Python) available? Please list them.

Thanks a lot for your help.

Recommended answer

There are a few questions here. I'll try to cover them all.

PiggyBank documentation

There is (sadly) no user manual for the piggybank UDFs that explains how to use each of them from within a Pig script. However, the Pig javadoc includes information for each Java class implementing the UDFs in piggybank (scroll down to "contrib: Piggybank").

String to datetime

(Assuming Pig < 0.11)

To convert a string containing time-like information, you'll want to use the CustomFormatToISO UDF. It takes your chararray with the date information plus a datetime format specification and converts it into an ISO datetime format. Once the value is in this format, there are several Piggybank functions that operate on ISO formatted time.
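To make the conversion concrete, here is a minimal Python sketch of the same idea: parse a timestamp with a caller-supplied format and emit an ISO 8601 string. It is not the Piggybank UDF itself. The function name `custom_format_to_iso` is hypothetical, the pattern uses Python's `strptime` directives rather than the pattern syntax the Java UDF expects, and the input is assumed to already be in UTC.

```python
from datetime import datetime


def custom_format_to_iso(value, fmt):
    """Parse `value` with the strptime pattern `fmt` and return an ISO 8601 string.

    Hypothetical helper for illustration only; the input is assumed to be UTC.
    """
    parsed = datetime.strptime(value, fmt)
    # Emit millisecond precision and a trailing "Z" to mark UTC.
    millis = parsed.microsecond // 1000
    return parsed.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % millis


# A day/month/year timestamp becomes a sortable ISO string.
print(custom_format_to_iso("07/01/2009 01:07:01", "%d/%m/%Y %H:%M:%S"))
```

The point of normalizing early is that every later step (comparison, truncation, differences) can assume a single, unambiguous representation.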

Note also that comparing ISO formatted strings orders them by date. This means you can apply comparison and sort operations to them, and they will behave as if they were time aware. For more background see this SO answer: https://stackoverflow.com/a/9576911/9940
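The sorting claim can be checked with a short Python sketch. It assumes all timestamps share the same ISO layout and time zone (here UTC with a trailing `Z`); the sample values and the `parse` helper are made up for illustration.

```python
from datetime import datetime

# Sample ISO 8601 timestamps, deliberately out of order (illustrative values).
iso_times = [
    "2013-03-01T09:30:00.000Z",
    "2012-12-31T23:59:59.000Z",
    "2013-01-15T00:00:00.000Z",
]


def parse(ts):
    """Parse an ISO timestamp of the exact layout used above."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


by_string = sorted(iso_times)           # plain lexicographic sort
by_time = sorted(iso_times, key=parse)  # sort by the parsed datetime

print(by_string == by_time)
```

The two orderings agree because the fields run from most to least significant and are zero-padded to a fixed width. Mixing time zone offsets or layouts would break that guarantee.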

If you're using Pig 0.11 or later, you can use the built-in ToDate() function: http://pig.apache.org/docs/r0.11.1/func.html#to-date
