Apache Pig permissions issue


Question

I'm attempting to get Apache Pig up and running on my Hadoop cluster, and am encountering a permissions problem. Pig itself launches and connects to the cluster just fine; from within the Pig shell, I can ls through and around my HDFS directories. However, when I try to actually load data and run Pig commands, I run into permissions-related errors:

grunt> A = load 'all_annotated.txt' USING PigStorage() AS (id:long, text:chararray, lang:chararray);
grunt> DUMP A;
2011-08-24 18:11:40,961 [main] ERROR org.apache.pig.tools.grunt.Grunt - You don't have permission to perform the operation. Error from the server: org.apache.hadoop.security.AccessControlException: Permission denied: user=steven, access=WRITE, inode="":hadoop:supergroup:r-xr-xr-x
2011-08-24 18:11:40,977 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
Details at logfile: /Users/steven/Desktop/Hacking/hadoop/pig/pig-0.9.0/pig_1314230681326.log
grunt> 

In this case, all_annotated.txt is a file in my HDFS home directory that I created and most definitely have permission to read; the same problem occurs no matter what file I try to load. However, I don't think that's the real problem, as the error itself indicates Pig is trying to write somewhere. Googling around, I found a few mailing list posts suggesting that certain Pig Latin statements (order, etc.) need write access to a temporary directory on the HDFS file system, whose location is controlled by the hadoop.tmp.dir property in hdfs-site.xml. I don't think load falls into that category, but just to be sure, I changed hadoop.tmp.dir to point to a directory within my HDFS home directory, and the problem persisted.
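In case it helps, this is the sort of check I can run to look at the permissions involved; the paths below are just the usual defaults (/tmp, /user/steven) and may differ on other clusters:

hadoop fs -ls /              # permissions on the root inode and top-level dirs such as /tmp
hadoop fs -ls /tmp           # the usual HDFS scratch location
hadoop fs -ls /user/steven   # my home directory, where all_annotated.txt lives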

So, anybody out there have any ideas as to what might be going on?

Answer

Probably your pig.temp.dir setting. It defaults to /tmp on HDFS, and Pig writes its temporary results there. If you don't have permission to /tmp, Pig will complain. Try overriding it with -Dpig.temp.dir.
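For example (the path below is only an illustration; any HDFS directory your user can write to will do):

hadoop fs -mkdir /user/steven/tmp     # create a writable scratch dir in your HDFS home directory
pig -Dpig.temp.dir=/user/steven/tmp   # launch Pig with the temp dir overridden

The same property can also be set once in conf/pig.properties instead of being passed on every launch.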

