Apache Pig permissions issue

Problem description

I'm trying to get Apache Pig up and running on my Hadoop cluster and am running into a permissions problem. Pig itself launches and connects to the cluster just fine: from within the Pig shell, I can ls through and around my HDFS directories. However, when I try to actually load data and run Pig commands, I get permissions-related errors:

grunt> A = load 'all_annotated.txt' USING PigStorage() AS (id:long, text:chararray, lang:chararray);
grunt> DUMP A;
2011-08-24 18:11:40,961 [main] ERROR org.apache.pig.tools.grunt.Grunt - You don't have permission to perform the operation. Error from the server: org.apache.hadoop.security.AccessControlException: Permission denied: user=steven, access=WRITE, inode="":hadoop:supergroup:r-xr-xr-x
2011-08-24 18:11:40,977 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
Details at logfile: /Users/steven/Desktop/Hacking/hadoop/pig/pig-0.9.0/pig_1314230681326.log
grunt> 

In this case, all_annotated.txt is a file in my HDFS home directory that I created and definitely have permissions on; the same problem occurs no matter which file I try to load. However, I don't think the file is the problem, because the error itself indicates that Pig is trying to write somewhere (access=WRITE). Googling around, I found a few mailing list posts suggesting that certain Pig Latin statements (order, etc.) need write access to a temporary directory on the HDFS file system whose location is controlled by the hadoop.tmp.dir property in hdfs-site.xml. I don't think load falls into that category, but just to be sure, I changed hadoop.tmp.dir to point to a directory within my HDFS home directory, and the problem persisted.

So, anybody out there have any ideas as to what might be going on?

Recommended answer

The likely cause is your pig.temp.dir setting. It defaults to /tmp on HDFS, and Pig writes temporary results there. If you don't have write permission on /tmp, Pig will complain. Try overriding it with -Dpig.temp.dir.
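
As a rough sketch of that override: the script below points pig.temp.dir at a scratch directory under the asker's HDFS home. The path /user/steven/pig_tmp is an assumption (it combines the user name from the error message with the usual /user/<name> home layout), so substitute a directory you can actually write to. The script only calls hadoop and pig if both are on the PATH; otherwise it prints the command it would have run.

```shell
#!/bin/sh
# Hypothetical scratch directory on HDFS; assumed writable by the current user.
PIG_TMP="/user/steven/pig_tmp"

# Java system property that tells Pig where to write intermediate results.
PIG_OPTS="-Dpig.temp.dir=${PIG_TMP}"

if command -v hadoop >/dev/null 2>&1 && command -v pig >/dev/null 2>&1; then
  # Create the scratch directory only if it does not exist yet.
  hadoop fs -test -d "${PIG_TMP}" || hadoop fs -mkdir "${PIG_TMP}"
  # Start the Grunt shell with the overridden temp directory.
  pig "${PIG_OPTS}"
else
  echo "would run: pig ${PIG_OPTS}"
fi
```

If the override helps, the same property can presumably be made permanent by adding a pig.temp.dir line to Pig's conf/pig.properties file instead of passing it on every invocation.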
