hadoop streaming failed with error code 5


Problem description



RHadoop program for wordcount:

Sys.setenv(HADOOP_CMD="/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
library(rmr2) 

## map function
map <- function(k,lines) {
  words.list <- strsplit(lines, '\\s') 
  words <- unlist(words.list)
  return( keyval(words, 1) )
}

## reduce function
reduce <- function(word, counts) { 
  keyval(word, sum(counts))
}

wordcount <- function (input, output=NULL) { 
  mapreduce(input=input, output=output, input.format="text", 
            map=map, reduce=reduce)
}

## Submit job
hdfs.root <- 'input'
#hdfs.data <- file.path(hdfs.root, 'data') 
hdfs.out <- file.path(hdfs.root, 'out') 
out <- wordcount(hdfs.root, hdfs.out)

## Fetch results from HDFS
results <- from.dfs(out)

## check top 2 frequent words
results.df <- as.data.frame(results, stringsAsFactors=F) 
colnames(results.df) <- c('word', 'count') 
head(results.df[order(results.df$count, decreasing=T), ], 2)

To check the RHadoop integration, I ran the above wordcount program with Rscript, but the run produced the errors shown below.
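A typical invocation might have looked like the following; the local file data.txt and the script name wordcount.R are hypothetical names used for illustration, while the 'input' path comes from the script itself:

# stage some text data under the HDFS 'input' directory the script reads from
hadoop fs -mkdir -p input
hadoop fs -put data.txt input/

# launch the MapReduce job through R
Rscript wordcount.R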

15/01/21 13:48:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [/usr/local/hadoop/data/hadoop-unjar5866699842450503195/] [] /tmp/streamjob7335081573862861018.jar tmpDir=null
15/01/21 13:48:53 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
15/01/21 13:48:53 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
15/01/21 13:48:53 ERROR streaming.StreamJob: Error Launching job : Permission denied: user=pgl-26, access=EXECUTE, inode="/tmp":hduser:supergroup:drwxrwx---
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:205)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:168)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5523)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3521)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:779)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:764)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  : 
  hadoop streaming failed with error code 5

Please help me with this error. I am new to both R and Hadoop, and I can't work out where I have gone wrong.

Solution

The job fails because the user submitting it (pgl-26 in the log) cannot access the HDFS /tmp directory, owned by hduser:supergroup with mode drwxrwx---, where the streaming job is staged. Give that user permission to the temp directory, for example: hadoop fs -chown -R rhadoop /tmp.

where rhadoop is the username of the user running the job.
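As a concrete sketch of that fix, run the commands as the HDFS superuser (hduser in the log above); the chmod alternative is an assumption added here, not part of the original answer:

# inspect the current owner and mode of /tmp on HDFS
sudo -u hduser hadoop fs -ls /

# option 1: hand /tmp over to the user submitting the job
sudo -u hduser hadoop fs -chown -R rhadoop /tmp

# option 2: keep the owner, but open /tmp to all users with the
# sticky bit set, mirroring a conventional local /tmp
sudo -u hduser hadoop fs -chmod 1777 /tmp

Substitute whichever user actually submits the job (pgl-26 in the error log) for rhadoop; after either change that user can traverse /tmp and the streaming job should launch.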
