Python 3 connection to Kerberos HBase Thrift over HTTPS

Problem description

We have a Python 3 application that connects to HBase and fetches data.

Connectivity worked fine with the Kerberos HBase Thrift binary protocol (over TSocket) until the Hadoop team moved the Hadoop system to Cloudera and Cloudera Manager, which starts the Kerberos HBase Thrift server in HTTPS mode.

Now that the protocol has changed from TSocket to HTTP/HTTPS, the Python code cannot authenticate using an HTTP client with SASL Kerberos.

The Python version in use is 3.6.8, and the package versions are:

  • thrift 0.13.0
  • hbase-thrift 0.20.4
  • pure_sasl 0.5.1

Working code in TSocket mode:

############

from subprocess import call

import jprops
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
from hbase.ttypes import *

# Read cluster.properties
with open('/data/properties/cluster.properties') as fp:
    properties = jprops.load_properties(fp)


# Obtain a Kerberos ticket from the keytab
def kerberos_ticket():
    principal = properties["principal"]
    # Assumption: the original snippet never defines keyTab;
    # the property name "keyTab" used here is a guess.
    keyTab = properties["keyTab"]
    call(["kinit", "-kt", keyTab, principal])


# HBase connection over TSocket with SASL/GSSAPI
def hbase_connection():
    thriftHost = properties["thriftHost"]
    hbaseService = properties["hbaseService"]
    Tsock = TSocket.TSocket(thriftHost, 9090)
    Tsock.setTimeout(2000000)  # timeout in milliseconds
    transport = TTransport.TSaslClientTransport(
        Tsock,
        host=thriftHost,
        service=hbaseService,
        mechanism='GSSAPI'
    )
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    return client, transport


# Get a kerberized ticket, then connect
kerberos_ticket()

client, transport = hbase_connection()
transport.open()

print(client.getTableNames())

###########
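
The ticket step above could be made slightly more defensive by reusing a valid ticket when one is already cached. The following is only a sketch under stated assumptions: the helper names `build_kinit_command` and `ensure_ticket` are hypothetical, and it relies on MIT Kerberos `klist -s`, which exits with status 0 when the credential cache holds a valid ticket.

```python
import subprocess
from typing import Callable, List


def build_kinit_command(keytab_path: str, principal: str) -> List[str]:
    """Return the argument list for a keytab-based kinit (no shell involved)."""
    return ["kinit", "-kt", keytab_path, principal]


def ensure_ticket(keytab_path: str, principal: str,
                  run: Callable[[List[str]], int] = subprocess.call) -> bool:
    """Return True if a valid ticket exists or could be obtained.

    `run` is injectable so the logic can be exercised without Kerberos tools.
    """
    # klist -s prints nothing and signals ticket validity through its exit status.
    if run(["klist", "-s"]) == 0:
        return True
    # No valid ticket in the cache: request one from the keytab.
    return run(build_kinit_command(keytab_path, principal)) == 0
```

Calling `ensure_ticket(keyTab, principal)` before `hbase_connection()` would then skip `kinit` whenever the cache is still valid.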

In the TTransport.py source, the docstring for TSaslClientTransport suggests that it is meant to wrap a TSocket:

https://github.com/apache/thrift/blob/master/lib/py/src/transport/TTransport.py

TTransport.TSaslClientTransport

"transport: an underlying transport to use, typically just a TSocket"

We tried THttpClient.THttpClient(url) from

https://github.com/apache/thrift/blob/master/lib/py/src/transport/THttpClient.py

but it cannot be wrapped in TTransport.TSaslClientTransport for SASL Kerberos.

Can Python be used with Kerberos HBase Thrift over HTTPS on a Cloudera-managed cluster? If not, is there an alternative way to connect to HBase (Kerberos) from Python?

PS: I went through the following question, which describes a similar issue, but it had no concrete solution:

Python program to connect to HBase via thrift server in Http mode

Thanks in advance,

Manjil

Solution

I found a solution for using Kerberos with HBase Thrift in HTTP mode.

Client-side Python packages:

  • six 1.15.0
  • thrift 0.13.0
  • hbase-thrift 0.20.4
  • pykerberos 1.2.1


### Python code
# Prerequisite: kinit has been run and a Kerberos ticket is available for the user.
# The HBase Thrift server runs over HTTP in secure mode.
# This code uses the ticket in the local Kerberos cache,
# adds the Kerberos context to the HTTP header,
# and performs HBase client operations such as listing tables, table scans, etc.
#
# Important: a session opened on the httpClient transport is valid for a single call only.
#            For the next HBase operation, get a new Kerberos context (krb_context),
#            set the header again, and reopen the session.
##


import kerberos
from thrift import Thrift
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol
from hbase.Hbase import Client
import ssl


def kerberos_auth():
    # The service can be hbase or HTTP, depending on the HBase Thrift configuration
    hbaseService = "<hbase>/<HOST>@<DOMAIN.COM>"
    clientPrincipal = "<user>@<DOMAIN.COM>"
    __, krb_context = kerberos.authGSSClientInit(hbaseService, principal=clientPrincipal)
    kerberos.authGSSClientStep(krb_context, "")
    negotiate_details = kerberos.authGSSClientResponse(krb_context)
    headers = {
        'Authorization': 'Negotiate ' + negotiate_details,
        'Content-Type': 'application/binary',
    }
    return headers


# ssl._create_unverified_context() disables certificate checks;
# use it only if no SSL verification is required
httpClient = THttpClient.THttpClient(
    'https://<THRIFT_HOST>:9090/',
    cert_file='<client cert file path>.crt',
    key_file='<client cert key file path>.key',
    ssl_context=ssl._create_unverified_context(),
)

# New session: start
httpClient.setCustomHeaders(headers=kerberos_auth())
protocol = TBinaryProtocol.TBinaryProtocol(httpClient)
httpClient.open()
client = Client(protocol)
# New session: end
client.getTableNames()
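
Since each session is good for one call only, the "new session" steps could be wrapped so that every operation gets a fresh Kerberos token and transport. This is a rough sketch, not a tested implementation: `negotiate_headers`, `connect_hbase`, and `with_fresh_session` are hypothetical helper names, the third-party imports are done lazily inside `connect_hbase`, and `ssl_context=None` is assumed to fall back to Python's default certificate verification.

```python
from typing import Any, Callable, Dict, Tuple


def negotiate_headers(token: str) -> Dict[str, str]:
    """Build the HTTP headers that carry a base64 SPNEGO token."""
    return {
        "Authorization": "Negotiate " + token,
        "Content-Type": "application/binary",
    }


def connect_hbase(url: str, service: str, principal: str,
                  ssl_context=None) -> Tuple[Any, Any]:
    """Open a freshly authenticated transport; returns (client, transport)."""
    # Lazy imports: pykerberos, thrift and hbase-thrift must be installed.
    import kerberos
    from thrift.transport import THttpClient
    from thrift.protocol import TBinaryProtocol
    from hbase.Hbase import Client

    _, ctx = kerberos.authGSSClientInit(service, principal=principal)
    try:
        kerberos.authGSSClientStep(ctx, "")
        token = kerberos.authGSSClientResponse(ctx)
    finally:
        kerberos.authGSSClientClean(ctx)  # release the GSS context

    transport = THttpClient.THttpClient(url, ssl_context=ssl_context)
    transport.setCustomHeaders(headers=negotiate_headers(token))
    transport.open()
    return Client(TBinaryProtocol.TBinaryProtocol(transport)), transport


def with_fresh_session(connect: Callable[[], Tuple[Any, Any]],
                       operation: Callable[[Any], Any]) -> Any:
    """Run exactly one operation on a new session, then close the transport."""
    client, transport = connect()
    try:
        return operation(client)
    finally:
        transport.close()
```

A call might then look like `with_fresh_session(lambda: connect_hbase('https://<THRIFT_HOST>:9090/', service, principal), lambda c: c.getTableNames())`, repeated for each HBase operation.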

Thank you,
Manjil
