Writing a UDF in Pig (tutorial-style question)


Problem description

I am new to Pig and am trying to write a UDF.

Here is the problem statement. I have dummy data like this:

 user_id, movie_id, date_time_stamp

What I am trying to do is this: if the transaction is between

    9 am and 11 am --> breakfast
    and so on

Here is my Pig script:

    REGISTER path/myudfs.jar
    in = LOAD 'path/input' USING
        PigStorage('\\u001') AS (user:long,movie:long, time:chararray);

    result = foreach in GENERATE  myudfs.time(time);
    STORE result INTO 'path/output/time' using PigStorage(',');

The Java code in myudfs.jar looks like this:

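    package myudfs;

    import java.io.IOException;
    import java.text.DateFormat;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class time extends EvalFunc<String> {

        public String exec(Tuple input) throws IOException {
            if ((input == null) || (input.size() == 0))
                return null;
            try {
                String time = (String) input.get(0);
                // note: "hh" is the 12-hour field; "HH" is the 24-hour field
                DateFormat df = new SimpleDateFormat("hh:mm:ss.000");
                Date date = df.parse(time);
                // getTimeOfDay is a helper that is not shown in the question
                String timeOfDay = getTimeOfDay(date);
                return timeOfDay;
            } catch (ParseException e) {
                //how will I handle when df.parse(time) fails and throws ParseException?
                //maybe:
                return null;
            }
        }
    }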

So it takes in the tuple and returns a string. (I am also new to Java.)
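The getTimeOfDay helper is called in the UDF but not shown in the question. A minimal sketch of what it might look like follows. Only the 9 am to 11 am "breakfast" window comes from the question; the class name TimeOfDay, the labelForHour helper, the other labels and hour ranges, and treating 11 am as exclusive are all assumptions made for illustration.

```java
import java.util.Calendar;
import java.util.Date;

public class TimeOfDay {

    // Extracts the hour (0-23, default time zone) from the date and maps it to a label.
    public static String getTimeOfDay(Date date) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        return labelForHour(cal.get(Calendar.HOUR_OF_DAY));
    }

    // Maps an hour of the day to a label.
    public static String labelForHour(int hour) {
        if (hour >= 9 && hour < 11) {
            return "breakfast"; // from the question: 9 am to 11 am (upper bound assumed exclusive)
        }
        if (hour >= 11 && hour < 15) {
            return "lunch";     // hypothetical bucket, not in the question
        }
        if (hour >= 18 && hour < 22) {
            return "dinner";    // hypothetical bucket, not in the question
        }
        return "other";         // hypothetical fallback label
    }
}
```

Keeping the hour-to-label mapping in its own method makes it easy to check the bucket boundaries without constructing Date objects.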

After this I try to run the script as

 pig -f time.pig

It returns an error:

    2012-11-12 08:33:08,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: maprfs:///
    2012-11-12 08:33:08,353 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: maprfs:///
    2012-11-12 08:33:08,767 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1069: Problem resolving class version numbers for class myudfs.time

Someone on the Pig mailing list suggested that my PIG_CLASSPATH was not set and that I should point it to /path/hadoop/conf.

I did that, so now $ echo $PIG_CLASSPATH --> /path/hadoop/conf

But I still get the same error.

Please advise. Thanks.

Edit 1: Looking into the log, the error trace is:

    Caused by: java.lang.UnsupportedClassVersionError: myudfs/time : Unsupported major.minor version 51.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:427)
        ... 27 more

Is this a Java issue?

Recommended answer

To find the jar version, open the jar using WinZip (or a similar tool) and look for MANIFEST.MF. There should be a line in there that says 'Created-By', which gives the version of Java that was used to build the jar.

This needs to be older than or equal to the version of Java you are running it with (here, the JVM that runs Pig). To check that version at the command line, type:

java -version

or in Eclipse go to

project(menu) > properties (menu item) > java build path (in list) > libraries (tab)

and take a look at the version that you are using for the JDK/JRE (you may be able to tell this from the directory name; if not, go to that directory and run java -version).

Chances are you'll need to update the version of Java you have in Eclipse.
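The "Unsupported major.minor version 51.0" in the trace refers to the class-file format version, and major version 51 corresponds to Java 7. As a complement to the manifest check, the following is a minimal sketch that reads the major version straight from a compiled .class file header. The class name ClassVersionCheck and its method names are made up for this example.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassVersionCheck {

    // A class file starts with: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
    public static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a Java class file");
        }
        data.readUnsignedShort();        // minor version, not needed here
        return data.readUnsignedShort(); // major version
    }

    // Major 45 is Java 1.1; from major 49 (Java 5) onward the release is major - 44.
    public static String javaRelease(int major) {
        return major >= 49 ? "Java " + (major - 44) : "Java 1." + (major - 44);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a .class file extracted from the jar
        InputStream in = new FileInputStream(args[0]);
        try {
            int major = majorVersion(in);
            System.out.println("major version " + major + " (" + javaRelease(major) + ")");
        } finally {
            in.close();
        }
    }
}
```

To use it, extract the UDF's .class file from the jar and pass its path as the only argument. If the reported release is newer than what java -version prints on the machine running Pig, the class has to be recompiled for the older release or Pig has to run on a newer JVM.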
