Pig join fails with java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer


Problem description

I have two files. data1 contains:

1 3
1 2
5 1

data2 contains:

2 3
2 4

I then tried to read them into Pig:

d1 = LOAD 'data1';
d2 = foreach d1 generate flatten(STRSPLIT($0, ' +')) as (f1:int,f2:int);
d3 = LOAD 'data2' ;
d4 = foreach d3 generate flatten(STRSPLIT($0, ' +')) as (f1:int,f2:int);
data = join d2 by f1, d4 by f2;

Then I got:

2013-08-04 00:48:26,032 [Thread-21] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
    at org.apache.pig.backend.hadoop.HDataType.getWritableComparableTypes(HDataType.java:85)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:112)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:285)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

Could anybody help me? Thank you.

Recommended answer

First, I'd define a simple schema for the inputs. Based on your example, I assume your inputs are text files, so each record can be loaded as a single chararray line.

You get the ClassCastException because applying the schema (f1:int, f2:int) in the AS clause does not convert anything: it only declares the types, while the values produced by STRSPLIT are still strings. The join then treats the key as an int and fails on the String it actually finds. You need to explicitly cast the output of STRSPLIT to (tuple(int,int)) so that flatten can generate f1:int and f2:int from it, i.e.:

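-- Load each record as a single line of text.
d1 = LOAD 'data1' as (line:chararray);
-- Split on runs of spaces, then cast the resulting tuple to (int,int) before flattening.
d2 = foreach d1 generate flatten((tuple(int,int))(STRSPLIT($0, ' +')))
       as (f1:int,f2:int);

d3 = LOAD 'data2' as (line:chararray);
d4 = foreach d3 generate flatten((tuple(int,int))(STRSPLIT($0, ' +')))
       as (f1:int,f2:int);

-- Both join keys are now real ints.
data = join d2 by f1, d4 by f2;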
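
If the fields in each file are always separated by exactly one space (an assumption; the ' +' pattern in the question suggests runs of spaces may occur), an alternative sketch is to let the loader split and type the fields, so no STRSPLIT or tuple cast is needed. DESCRIBE is included only to confirm the schema Pig has assigned to each relation:

```pig
-- Assumes a single space between fields; PigStorage splits on one delimiter character.
d2 = LOAD 'data1' USING PigStorage(' ') AS (f1:int, f2:int);
d4 = LOAD 'data2' USING PigStorage(' ') AS (f1:int, f2:int);

-- Print the schema of each relation to check that both fields are int.
DESCRIBE d2;
DESCRIBE d4;

data = JOIN d2 BY f1, d4 BY f2;
DUMP data;
```

Note that with the sample data from the question the inner join has no matching keys (f1 in data1 is 1, 1, 5; f2 in data2 is 3, 4), so an empty result is expected once the cast problem is fixed.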
