Running a map reduce job as a different user


Problem description



I have a web application that interacts with Hadoop (Cloudera cdh3u6). A particular user operation should launch a new Map Reduce job in the cluster.

The cluster is not a secure cluster, but it uses simple group authentication - so if I ssh to it as myself, I can launch MR jobs from the command line.

In the web application, I'm using the ToolRunner to run my job:

MyMapReduceWrapperClass mr = new MyMapReduceWrapperClass();
ToolRunner.run(mr, null);


// inside the run implementation of my wrapper class : 
Job job = new Job(conf, "job title");
//set up stuff removed
job.submit();

Currently this job is submitted as the user that launched the web application server (Tomcat) process, and that user is a special local account on this web server that doesn't have permissions to send jobs to the cluster.

Ideally I'd like to be able to get some kind of identity from the user and pass it along, so that as different users interact with the web app / service we can see who is invoking which jobs. Skipping over the issues of how to actually coordinate those credential services, I'm not even clear on where that identity would go.

I see that on a Job I have a getCredentials() option, but from reading about the token / Kerberos stuff in there I have the impression that this is for secured clusters (which I think we are not) - not to mention I don't think my webserver has Kerberos installed. That could be fixed though. But it also sounds like the intended use case is to add secrets that a map reduce job might want while running to access other services - and not about running the job as someone else.

I also see that on the (older?) JobConf class I have the ability to setUser(String name) which seems promising - even though I don't know where it would require a password or something - but I can't find much information or documentation on that function. I tried it out and it had no impact - the job was still submitted as the Tomcat user.

Are there other avenues to explore or research? I am out of keywords to Google. I would prefer not to be left with the option "Just give your tomcat user permissions on the cluster": I don't manage that asset and I don't expect that request to fly. If, however, that literally is my only option, I'd like to understand why, so that I can argue the need with the right information.

Solution

You can use the UserGroupInformation class like this:

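import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.util.ToolRunner;

// username: the name of the user the job should be submitted as
UserGroupInformation ugi = UserGroupInformation.createRemoteUser(username);
ugi.doAs(new PrivilegedExceptionAction<MyMapReduceWrapperClass>() {
    // The return type must match the action's type parameter
    public MyMapReduceWrapperClass run() throws Exception {
        MyMapReduceWrapperClass mr = new MyMapReduceWrapperClass();
        ToolRunner.run(mr, null); // runs, and submits the job, as the user wrapped by ugi
        return mr;
    }
});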
