Running any Hadoop command fails after enabling security.

Problem Description

I was trying to enable Kerberos for my CDH 4.3 (via Cloudera Manager) test bed. After changing authentication from Simple to Kerberos in the web UI, I'm unable to do any hadoop operations, as shown below. Is there any way to specify the keytab explicitly?

[root@host-dn15 ~]# su - hdfs
-bash-4.1$ hdfs dfs -ls /
13/09/10 08:15:35 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
13/09/10 08:15:35 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
13/09/10 08:15:35 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host-dn15.hadoop.com/192.168.10.227"; destination host is: "host-dn15.hadoop.com":8020;
-bash-4.1$ kdestroy
-bash-4.1$ kinit
Password for hdfs@HADOOP.COM:
-bash-4.1$ klist
Ticket cache: FILE:/tmp/krb5cc_494
Default principal: hdfs@HADOOP.COM

Valid starting     Expires            Service principal
09/10/13 08:20:31  09/11/13 08:20:31  krbtgt/HADOOP.COM@HADOOP.COM
    renew until 09/10/13 08:20:31

-bash-4.1$ klist -e
Ticket cache: FILE:/tmp/krb5cc_494
Default principal: hdfs@HADOOP.COM

Valid starting     Expires            Service principal
09/10/13 08:20:31  09/11/13 08:20:31  krbtgt/HADOOP.COM@HADOOP.COM
    renew until 09/10/13 08:20:31, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
-bash-4.1$

So I took a good look at the namenode log,

2013-09-10 10:02:06,085 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8022: readAndProcess threw exception javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)] from client 10.132.100.228. Count of bytes read: 0
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]

JCE policy files are already installed on all nodes.

[root@host-dn15 security]# sha256sum ./local_policy.jar
4a5c8f64107c349c662ea688563e5cd07d675255289ab25246a3a46fc4f85767  ./local_policy.jar
[root@host-dn15 security]# sha256sum ./US_export_policy.jar
b800fef6edc0f74560608cecf3775f7a91eb08d6c3417aed81a87c6371726115  ./US_export_policy.jar
[root@host-dn15 security]# sha256sum ./local_policy.jar.bak
7b26d0e16722e5d84062240489dea16acef3ea2053c6ae279933499feae541ab  ./local_policy.jar.bak
[root@host-dn15 security]# sha256sum ./US_export_policy.jar.bak
832133c52ed517df991d69770f97c416d2e9afd874cb4f233a751b23087829a3  ./US_export_policy.jar.bak
[root@host-dn15 security]#

And the list of principals in the realm.

kadmin:  listprincs
HTTP/host-dn15.hadoop.com@HADOOP.COM
HTTP/host-dn16.hadoop.com@HADOOP.COM
HTTP/host-dn17.hadoop.com@HADOOP.COM
K/M@HADOOP.COM
cloudera-scm/admin@HADOOP.COM
hbase/host-dn15.hadoop.com@HADOOP.COM
hbase/host-dn16.hadoop.com@HADOOP.COM
hbase/host-dn17.hadoop.com@HADOOP.COM
hdfs/host-dn15.hadoop.com@HADOOP.COM
hdfs/host-dn16.hadoop.com@HADOOP.COM
hdfs/host-dn17.hadoop.com@HADOOP.COM
hdfs@HADOOP.COM
hue/host-dn15.hadoop.com@HADOOP.COM
host-dn16/hadoop.com@HADOOP.COM
kadmin/admin@HADOOP.COM
kadmin/changepw@HADOOP.COM
kadmin/host-dn15.hadoop.com@HADOOP.COM
krbtgt/HADOOP.COM@HADOOP.COM
mapred/host-dn15.hadoop.com@HADOOP.COM
mapred/host-dn16.hadoop.com@HADOOP.COM
mapred/host-dn17.hadoop.com@HADOOP.COM
root/admin@HADOOP.COM
root@HADOOP.COM
zookeeper/host-dn15.hadoop.com@HADOOP.COM
kadmin:  exit
[root@host-dn15 ~]#

Exported the keytab for hdfs and used it to kinit.

-bash-4.1$ kinit -kt ./hdfs.keytab hdfs
-bash-4.1$ klist
Ticket cache: FILE:/tmp/krb5cc_494
Default principal: hdfs@HADOOP.COM

Valid starting     Expires            Service principal
09/10/13 09:49:42  09/11/13 09:49:42  krbtgt/HADOOP.COM@HADOOP.COM
    renew until 09/10/13 09:49:42

Everything was futile. Any ideas?

Thanks,

Recommended Answer

I ran into a problem in which I had a Kerberized CDH cluster and even with a valid Kerberos ticket, I couldn't run any hadoop commands from the command line.

NOTE: After writing this answer I wrote it up as a blog post at http://sarastreeter.com/2016/09/26/resolving-hadoop-problems-on-kerberized-cdh-5-x/ . Please share!

So even with a valid ticket, this would fail:

$ hadoop fs -ls /

WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

Here is what I learned and how I ended up resolving the problem. I have linked to Cloudera doc for the current version where possible, but some of the doc seems to be present only for older versions.

Please note that the problem comes down to a configuration issue but that Kerberos itself and Cloudera Manager were both installed correctly. Many of the problems I ran across while searching for answers came down to Kerberos or Hadoop being installed incorrectly. The problem I had occurred even though both Hadoop and Kerberos were functional, but they were not configured to work together properly.

Do a klist as the user you are trying to execute the hadoop command with.

$ sudo su - myuser
$ klist

If you don't have a ticket, it will print:

klist: Credentials cache file '/tmp/krb5cc_0' not found

If you try to do a hadoop command without a ticket you will get the GSS INITIATE FAILED error by design:

WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

In other words, that is not an install problem. If this is your situation, take a look at:

  • http://www.roguelynn.com/words/explain-like-im-5-kerberos/
  • For other troubleshooting of Kerberos in general, check out https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/errors.html
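
As a quick sanity check before digging into configuration, obtain a ticket as that user and retry the command. A minimal sketch (the principal and keytab names below are examples, not from the original setup):

$ kinit myuser@HADOOP.COM          # or: kinit -kt /path/to/myuser.keytab myuser@HADOOP.COM
$ klist                            # confirm a krbtgt/HADOOP.COM@HADOOP.COM entry exists and has not expired
$ hadoop fs -ls /

If this still fails with GSS INITIATE FAILED even though klist shows a valid ticket, continue with the checks below.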

A default install of Cloudera has user and group restrictions on execution of hadoop commands, including a specific ban on certain users (more on page 57 of http://www.cloudera.com/documentation/enterprise/5-6-x/PDF/cloudera-security.pdf).

There are several properties that deal with this, including the supergroup for hdfs being set to the string supergroup instead of hdfs, the dfs_permissions enabled property being set to false by default (Hadoop user file permissions), and users with a uid over 1000 being banned.

Any of these could be a factor; for me it was hdfs being listed in the banned.users property.

Specifically for the hdfs user, make sure you have removed hdfs from the banned.users configuration property in hdfs-site.xml if you are trying to use it to execute hadoop commands.
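
A quick way to see how these properties are actually set on a node is to grep the deployed client configuration. This is only a sketch; /etc/hadoop/conf is the usual CDH client config location and may differ on your cluster:

$ grep -A1 'banned.users' /etc/hadoop/conf/hdfs-site.xml
$ grep -A1 'dfs.permissions' /etc/hadoop/conf/hdfs-site.xml
# If hdfs shows up in the banned.users value, remove it through the corresponding
# Cloudera Manager configuration field and redeploy the client configuration.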

  1) UNPRIVILEGED USER AND WRITE PERMISSIONS

The Cloudera-recommended way to execute Hadoop commands is to create an unprivileged user and matching principal, instead of using the hdfs user. A gotcha is that this user also needs its own /user directory and can run into write permissions errors with the /user directory. If your unprivileged user does not have a directory in /user, it may result in the WRITE permissions denied error.
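
If the unprivileged user is missing its home directory, it can be created with HDFS superuser rights. A sketch, assuming a user named myuser and the hdfs keytab exported earlier:

$ kinit -kt ./hdfs.keytab hdfs@HADOOP.COM
$ hadoop fs -mkdir /user/myuser
$ hadoop fs -chown myuser:myuser /user/myuser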

Another related issue is that Cloudera sets dfs.datanode.data.dir to 750 on a non-kerberized cluster, but requires 700 on a kerberized cluster. With the wrong directory permissions set, the Kerberos install will fail. The ports for the datanodes must also be set to values below 1024; the recommended values are 1006 for the HTTP port and 1004 for the DataNode port.

http://www.cloudera.com/documentation/enterprise/5-6-x/topics/cdh_ig_hdfs_cluster_deploy.html

http://www.cloudera.com/documentation/archive/manager/4-x/4-7-2/Configuring-Hadoop-Security-with-Cloudera-Manager/cmchs_enable_security_s9.html
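
To verify these settings on a datanode, something like the following works. The data directory path is an example; substitute the actual value of dfs.datanode.data.dir from your configuration:

$ stat -c '%a %U:%G %n' /data/dfs/dn                                    # should report 700 hdfs:hdfs on a kerberized cluster
$ grep -A1 'dfs.datanode.address' /etc/hadoop/conf/hdfs-site.xml        # expect port 1004
$ grep -A1 'dfs.datanode.http.address' /etc/hadoop/conf/hdfs-site.xml   # expect port 1006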

  3) SERVICE-SPECIFIC CONFIGURATION TASKS 

On page 60 of the security doc, there are steps to kerberize Hadoop services. Make sure you did these!

MapReduce

$ sudo -u hdfs hadoop fs -chown mapred:hadoop ${mapred.system.dir}

HBase

$ sudo -u hdfs hadoop fs -chown -R hbase ${hbase.rootdir}

Hive

$ sudo -u hdfs hadoop fs -chown hive /user/hive

YARN

$ rm -rf ${yarn.nodemanager.local-dirs}/usercache/*

All of these steps EXCEPT for the YARN one can happen at any time. The step for YARN must happen after Kerberos installation because it removes the user cache for non-kerberized YARN data. When you run MapReduce after the Kerberos install, it should repopulate the cache with Kerberized user cache data.

Otherwise, YARN applications exit with exitCode: -1000, "Not able to initialize user directories".

  1) SHORT NAME RULES MAPPING

Kerberos principals are "mapped" to the OS-level services users. For example, hdfs/WHATEVER@REALM maps to the service user 'hdfs' in your operating system only because of a name mapping rule set in the core-site of Hadoop. Without name mapping, Hadoop wouldn't know which user is authenticated by which principal.

If you are using a principal that should map to hdfs, make sure the principal name resolves correctly to hdfs according to these Hadoop rules.

(has a name mapping rule by default)

  • hdfs@REALM
  • hdfs/_HOST@REALM

(no name mapping rule by default)

  • hdfs-TAG@REALM

The "bad" example will not work unless you add a rule to accommodate it.

http://www.cloudera.com/documentation/archive/cdh/4-x/4-5-0/CDH4-Security-Guide/cdh4sg_topic_19.html
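
Hadoop ships a small helper class that prints how a given principal resolves to a short name, which is a convenient way to test the mapping rules. A sketch (the principal is an example):

$ hadoop org.apache.hadoop.security.HadoopKerberosName hdfs/host-dn15.hadoop.com@HADOOP.COM
# Expected output along the lines of:
# Name: hdfs/host-dn15.hadoop.com@HADOOP.COM to hdfs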

  2) KEYTAB AND PRINCIPAL KEY VERSION NUMBERS MUST MATCH

The Key Version Number (KVNO) is the version of the key that is actively being used (as if you had a house key but then changed the lock on the door so it used a new key, the old one is no longer any good). Both the keytab and principal have a KVNO and the version number must match.

By default, when you use ktadd or xst to export the principal to a keytab, it changes the keytab version number, but does not change the KVNO of the principal. So you can end up accidentally creating a mismatch.

Use -norandkey with kadmin or kadmin.local when exporting a principal to a keytab to avoid updating the keytab number and creating a KVNO mismatch.
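
For example, a sketch of exporting the hdfs keytab without bumping the key version (the principal and output path are illustrative):

$ kadmin.local -q 'xst -norandkey -k /tmp/hdfs.keytab hdfs@HADOOP.COM'
$ klist -kte /tmp/hdfs.keytab        # the KVNO column here should match the kvno reported by getprinc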

In general, whenever you are having principal authentication issues, make sure to check that the KVNO of the principal and the keytab match:

Principal

$ kadmin.local -q 'getprinc myprincipalname'

Keytab

$ klist -kte mykeytab

Creating principals:

http://www.cloudera.com/documentation/archive/cdh/4-x/4-3-0/CDH4-Security-Guide/cdh4sg_topic_3_4.html

  1) JAVA VERSION MISMATCH WITH JCE JARS

Hadoop needs the Java security JCE Unlimited Strength jars installed in order to use AES-256 encryption with Kerberos. Both Hadoop and Kerberos need to have access to these jars. This is an install issue but it is easy to miss because you can think you have the security jars installed when you really don't.

  • the jars are the right version - the correct security jars are bundled with Java, but if you install them after the fact you have to make sure the version of the jars corresponds to the version of Java, or you will continue to get errors. To troubleshoot, compare the md5sum hash of the jars from a brand new download of the JDK you're using against the md5sum hash of the ones on the Kerberos server.
  • the jars are in the right location $JAVA_HOME/jre/lib/security
  • Hadoop is configured to look for them in the right place. Check if there is an export statement for $JAVA_HOME to the correct Java install location in /etc/hadoop/conf/hadoop-env.sh

If Hadoop has JAVA_HOME set incorrectly it will fail with "GSS INITIATE FAILED". If the jars are not in the right location, Kerberos won't find them and will give an error that it doesn't support the AES-256 encryption type (UNSUPPORTED ENCTYPE).
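
A way to check all three points at once (the jar names are the standard JCE file names; the config path is the usual CDH client location and may differ):

$ grep -E 'export JAVA_HOME' /etc/hadoop/conf/hadoop-env.sh
$ ls -l "$JAVA_HOME/jre/lib/security/local_policy.jar" "$JAVA_HOME/jre/lib/security/US_export_policy.jar"
$ md5sum "$JAVA_HOME"/jre/lib/security/*.jar     # compare against the jars from a fresh download of the same JDK version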

http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_sg_s2_jce_policy.html

https://community.cloudera.com/t5/Cloudera-Manager-Installation/Problem-with-Kerberos-amp-user-hdfs/td-p/6809

Cloudera has an issue documented at http://www.cloudera.com/documentation/archive/cdh/3-x/3u6/CDH3-Security-Guide/cdh3sg_topic_14_2.html in which tickets must be renewed before hadoop commands can be issued. This only happens with Oracle JDK 6 Update 26 or earlier and package version 1.8.1 or higher of the MIT Kerberos distribution.

To check the package, do an rpm -qa | grep krb5 on CentOS/RHEL or aptitude search krb5 -F "%c %p %d %V" on Debian/Ubuntu.
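
For example (pick the line for your distro; exact package names vary):

$ rpm -qa | grep krb5                            # CentOS/RHEL
$ aptitude search krb5 -F "%c %p %d %V"          # Debian/Ubuntu
$ java -version                                  # confirm you are not on Oracle JDK 6 Update 26 or earlier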

So do a regular kinit as you would, then do a kinit -R to force the ticket to be renewed.

$ kinit -kt mykeytab myprincipal
$ kinit -R

And finally, the issue I actually had which I could not find documented anywhere ...

There are two important configuration files for Kerberos, the krb5.conf and the kdc.conf. These are configurations for the krb5kdc service and the KDC database. My problem was the krb5.conf file had a property: default_ccache_name = KEYRING:persistent:%{uid}.

This set my cache name to KEYRING:persistent and user uid (explained https://web.mit.edu/kerberos/krb5-1.13/doc/basic/ccache_def.html). When I did a kinit, it created the ticket in /tmp because the cache name was being set elsewhere as /tmp. Cloudera services obtain authentication with files generated at runtime in /var/run/cloudera-scm-agent/process , and these all export the cache name environment variable (KRB5CCNAME) before doing their kinit. That's why Cloudera could obtain tickets but my hadoop user couldn't.

The solution was to remove the line from krb5.conf that set default_ccache_name and allow kinit to store credentials in /tmp, which is the MIT Kerberos default value DEFCCNAME (documented at https://web.mit.edu/kerberos/krb5-1.13/doc/mitK5defaults.html#paths).
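
To confirm the diagnosis, or to work around it for a single shell without editing krb5.conf, you can point KRB5CCNAME at a file-based cache yourself. A sketch (the keytab path and principal are examples):

$ grep default_ccache_name /etc/krb5.conf        # the offending line, if present
$ export KRB5CCNAME=FILE:/tmp/krb5cc_$(id -u)
$ kinit -kt ./hdfs.keytab hdfs@HADOOP.COM
$ klist                                          # the ticket cache should now be FILE:/tmp/krb5cc_<uid>
$ hadoop fs -ls /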

https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_intro_kerb.html

http://www.cloudera.com/documentation/enterprise/5-6-x/PDF/cloudera-security.pdf, starting on page 48.
