Trouble running RecommenderJob on Hadoop


Problem description





After adding lined-sinple-sorted.txt and users.txt to the input directory in HDFS, I tried to run the following command:

hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar /opt/mahout/core/target/mahout-core-0.7-SNAPSHOT-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.input.dir=input/input.txt -Dmapred.output.dir=output --similarityClassname SIMILARITY_PEARSON_CORRELATION --usersFile input/users.txt --booleanData

Then I got the following error:

12/03/02 06:17:06 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --maxPrefsPerUser=[10], --maxPrefsPerUserInItemSimilarity=[1000], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[10], --similarityClassname=[SIMILARITY_PEARSON_CORRELATION], --startPhase=[0], --tempDir=[temp], --usersFile=[input/users.txt]}
12/03/02 06:17:06 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[input/input.txt], --maxPrefsPerUser=[1000], --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
12/03/02 06:17:07 INFO input.FileInputFormat: Total input paths to process : 1
12/03/02 06:17:08 INFO mapred.JobClient: Running job: job_201203020113_0018
12/03/02 06:17:09 INFO mapred.JobClient:  map 0% reduce 0%
12/03/02 06:17:23 INFO mapred.JobClient: Task Id : attempt_201203020113_0018_m_000000_0, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 1

    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

12/03/02 06:17:29 INFO mapred.JobClient: Task Id : attempt_201203020113_0018_m_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 1

    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

12/03/02 06:17:35 INFO mapred.JobClient: Task Id : attempt_201203020113_0018_m_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 1

    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

12/03/02 06:17:44 INFO mapred.JobClient: Job complete: job_201203020113_0018
12/03/02 06:17:44 INFO mapred.JobClient: Counters: 3
12/03/02 06:17:44 INFO mapred.JobClient:   Job Counters
12/03/02 06:17:44 INFO mapred.JobClient:     Launched map tasks=4
12/03/02 06:17:44 INFO mapred.JobClient:     Data-local map tasks=4
12/03/02 06:17:44 INFO mapred.JobClient:     Failed map tasks=1
Exception in thread "main" java.io.IOException: Cannot open filename /user/hduser/temp/preparePreferenceMatrix/numUsers.bin
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1497)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1488)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:376)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
    at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:267)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:162)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:293)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

What do I have to do to get rid of this error? (If possible, please write out the command.)

Your help will be appreciated.

Solution

Your input is malformed. Each line needs to be tab- or comma-separated.
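One way to find the offending lines before resubmitting the job is to scan a local copy of the input file. The sketch below is only an illustration, not part of Mahout: the function name `find_malformed_lines` and the sample data are hypothetical, and it assumes that a valid line has at least two non-empty fields separated by a tab or a comma. The `ArrayIndexOutOfBoundsException: 1` in the log is consistent with a mapper reading the second field of a line that has only one.

```python
import re

# Assumption: a tab or a comma is the accepted field delimiter, as the answer states.
DELIMITER = re.compile(r"[\t,]")


def find_malformed_lines(lines):
    """Return (line_number, text) pairs for lines with fewer than two non-empty fields."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        fields = DELIMITER.split(text)
        # A usable line needs at least a first and a second field.
        if len(fields) < 2 or not fields[0].strip() or not fields[1].strip():
            bad.append((number, text))
    return bad


if __name__ == "__main__":
    # Hypothetical sample data; replace with lines read from a local copy of the input.
    sample = ["1,101,5.0", "2\t102", "3: 104 105", "4 106"]
    for number, text in find_malformed_lines(sample):
        print(f"line {number} is not tab- or comma-separated: {text!r}")
```

If the check reports lines, convert the file to one tab- or comma-separated record per line, upload it to HDFS again, and rerun the same command.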
