Submit Oozie Job from another job's Java action with Kerberos


Problem description

I am trying to submit an Oozie job through the Java client API from inside another job's Java action. The cluster is secured with Kerberos.

Here is my code:

    import java.util.Properties;
    import org.apache.oozie.client.AuthOozieClient;
    import org.apache.oozie.client.OozieClient;

    // get an OozieClient for the local Oozie server
    String oozieUrl = "http://hadooputl02.northamerica.xyz.net:11000/oozie/";
    AuthOozieClient wc = new AuthOozieClient(oozieUrl);
    wc.setDebugMode(1);

    // create a workflow job configuration and set the workflow application path
    // (wfAppPath, the HDFS path of the workflow application, is not shown here)
    Properties conf = wc.createConfiguration();
    conf.setProperty(OozieClient.APP_PATH, wfAppPath);
    conf.setProperty("jobTracker", "yarnRM");
    conf.setProperty("nameNode", "hdfs://ingestiondev");

    // submit and start the workflow job
    String jobId = wc.run(conf);
    System.out.println("Workflow job submitted");

But I am getting the following error:

    org.apache.oozie.action.hadoop.JavaMainException: IO_ERROR :
    java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
        ...
    Caused by: AUTHENTICATION : Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
        ...
    Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
        ...
    Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

I believe something more is required in the code to give the node/user access to the Oozie server through Kerberos.

Can someone point me to the correct way to use the Oozie Java API on a Kerberized cluster?

Thanks!

Solution

The error message is explicit: Failed to find any Kerberos tgt. Your job runs in a YARN container, on an arbitrary node of the cluster, and no Kerberos ticket is available there.

Did you ever wonder how Oozie can start a job under your identity even though it does not know your password? It relies on a "backdoor" built into Hadoop. That backdoor does not hand your job real Kerberos credentials, hence the message you see as soon as you try something it does not cover.
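
To confirm from inside the Java action that no ticket cache is visible, a small diagnostic along these lines may help. It is only a sketch: the class and method names are made up for illustration, and it only understands file-based caches named by KRB5CCNAME (optionally prefixed with FILE:), not keyring or other cache types, nor the platform's default cache location.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

public class TicketCacheCheck {

    /** Strips an optional "FILE:" prefix from a KRB5CCNAME value; returns null if unset. */
    static String cachePath(String krb5ccname) {
        if (krb5ccname == null || krb5ccname.isEmpty()) {
            return null;
        }
        return krb5ccname.startsWith("FILE:")
                ? krb5ccname.substring("FILE:".length())
                : krb5ccname;
    }

    /** Describes whether a readable file-based ticket cache is visible to this process. */
    static String describe(String krb5ccname) {
        String path = cachePath(krb5ccname);
        if (path == null) {
            return "KRB5CCNAME is not set";
        }
        return Files.isReadable(Paths.get(path))
                ? "ticket cache found at " + path
                : "no readable ticket cache at " + path;
    }

    public static void main(String[] args) {
        // Run this inside the Java action to see what the container actually has.
        System.out.println(describe(System.getenv("KRB5CCNAME")));
    }
}
```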


How Oozie manages authentication without credentials

• you connect to an edge node, create a Kerberos ticket with kinit, and run an Oozie command line to submit a Coordinator (which will fire a Workflow at specific dates and times)
• the Oozie CLI authenticates against the Oozie server with the local Kerberos ticket, so the Coordinator (and Workflow) "belong to you"
• when the Coordinator triggers the Workflow, the Workflow starts an Action, and the Action starts a YARN job, it is the Oozie server that authenticates against the YARN ResourceManager (typically as oozie); by then your own Kerberos ticket has probably expired long ago
• but since oozie is defined as a privileged proxy account in the Hadoop configuration (the hadoop.proxyuser.oozie.* properties), the RM agrees to start the job under your account, even though you did not authenticate via Kerberos yourself
• how is that possible? Because internally YARN and HDFS use delegation tokens: usually you authenticate once with Kerberos, then you get a token, and you are good for all core services on all nodes; with Oozie in the mix, you don't even have to authenticate...

But there's a catch: the delegation token does not work for services that use pure Kerberos authentication, e.g. Hive Metastore, Hive JDBC, HBase, ZooKeeper, Oozie, etc.
That's why Oozie has a workaround: explicit <credential> requests for Hive actions, Hive2 actions, HBase actions, etc. [disclaimer: I don't really know how it actually works]

I doubt that any of these "credentials" would work against Oozie itself...!


How you can manage your own custom authentication

1. build a keytab file holding your Kerberos key, i.e. the equivalent of your password (cf. Linux command ktutil)
2. upload that file to HDFS with restricted access, because anyone who can read that file can then log in as you!!!
3. tell Oozie to download the file into the container that runs your Java action, with <file>; it will be available in the current working directory, so you won't have to care about the actual path
4. create a JAAS config file that tells Java: "whenever the Oozie REST server requests authentication via SPNEGO, create a Kerberos ticket on the fly using this principal, whose password is in that keytab file" (instead of the default, which is "look for the ticket cache and get an existing ticket there")
5. upload that JAAS config file to HDFS and ship it with another <file>
6. activate that JAAS config with a Java system property (java.security.auth.login.config)

You will find more details in that post of mine: "Error when connect to impala with JDBC under kerberos authrication"

Disclaimer: I don't know which JAAS "subject" (the name of the entry in the JAAS config file) Oozie expects; for instance, ZooKeeper expects Client and Hive expects com.sun.security.jgss.krb5.initiate.
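
Steps 4 to 6 can be sketched roughly as follows. Treat it as an illustration under several assumptions rather than a verified recipe: the class and method names are invented, the entry name com.sun.security.jgss.krb5.initiate is only a guess borrowed from the Hive case (per the disclaimer, the name Oozie expects is unknown), and whether AuthOozieClient honours a JAAS config file at all has not been checked here. For brevity the sketch writes the JAAS file at runtime instead of shipping it with <file>; the system properties only take effect if they are set before the JVM first loads its JAAS configuration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class JaasKeytabConfig {

    /** Builds one JAAS entry that logs in from a keytab instead of the ticket cache. */
    static String jaasEntry(String entryName, String keytab, String principal) {
        return entryName + " {\n"
                + "  com.sun.security.auth.module.Krb5LoginModule required\n"
                + "  useKeyTab=true\n"
                + "  keyTab=\"" + keytab + "\"\n"
                + "  principal=\"" + principal + "\"\n"
                + "  storeKey=true\n"
                + "  useTicketCache=false\n"
                + "  doNotPrompt=true;\n"
                + "};\n";
    }

    /** Writes the JAAS text to a temp file and points the JVM at it (step 6). */
    static Path activate(String jaasText) {
        try {
            Path conf = Files.createTempFile("jaas", ".conf");
            Files.write(conf, jaasText.getBytes(StandardCharsets.UTF_8));
            System.setProperty("java.security.auth.login.config", conf.toString());
            // Assumption: lets JGSS obtain credentials through the JAAS entry
            // rather than requiring them on the current Subject.
            System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // myname.keytab is expected in the container's working directory (via <file>).
        String text = jaasEntry("com.sun.security.jgss.krb5.initiate",
                "myname.keytab", "myname@REALM");
        System.out.println("JAAS config written to " + activate(text));
    }
}
```

The call to activate would have to run before the Oozie client is created, at the very start of the Java action's main method.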


Alternative: forget about JAAS and use the ticket cache.

• set the environment variable KRB5CCNAME to a temp file in the CWD of the container (which is destroyed automatically when the job stops)
• spawn the Linux command kinit -kt myname.keytab myname@REALM, which obtains a Kerberos ticket and stores it in the cache defined by KRB5CCNAME
• and let JAAS follow the default process
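
The kinit step of this alternative could look like the sketch below. The class name, method name and cache file name are hypothetical, and kinit is assumed to be on the PATH of the container. One caveat: the sketch sets KRB5CCNAME only for the spawned kinit process. A running JVM cannot change its own environment, so the JVM that later calls the Oozie client must already have been launched with the same KRB5CCNAME value for the default lookup to find the ticket.

```java
import java.io.File;

public class KinitFromKeytab {

    /** Prepares "kinit -kt <keytab> <principal>" writing into the given cache file. */
    static ProcessBuilder kinitCommand(String keytab, String principal, File cacheFile) {
        ProcessBuilder pb = new ProcessBuilder("kinit", "-kt", keytab, principal);
        // Applies to the child process only, not to the current JVM.
        pb.environment().put("KRB5CCNAME", "FILE:" + cacheFile.getAbsolutePath());
        pb.inheritIO(); // surface kinit's own error messages in the action log
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical cache file in the container's working directory.
        File cache = new File("krb5cc_oozie_action");
        Process kinit = kinitCommand("myname.keytab", "myname@REALM", cache).start();
        int exit = kinit.waitFor();
        if (exit != 0) {
            throw new IllegalStateException("kinit failed with exit code " + exit);
        }
        System.out.println("Ticket cache written to " + cache.getAbsolutePath());
    }
}
```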
