Python3 connection to Kerberos Hbase thrift HTTPS
Problem description
We have a Python3 application that connects to Hbase and fetches data.
The connectivity was working fine with the Kerberos Hbase Thrift binary protocol (over TSocket) until the Hadoop team moved the Hadoop system to Cloudera and Cloudera Manager, which starts Kerberos Hbase Thrift in HTTPS mode.
Now the protocol has changed from TSocket to HTTP/HTTPS, and the Python code cannot authenticate using the HTTP client with SASL Kerberos.
The current Python version is 3.6.8
and the package versions are
thrift=0.13.0
hbase-thrift=0.20.4
pure_sasl=0.5.1
Working code in TSocket mode:
############
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
from hbase.ttypes import *
import jprops
from subprocess import call, check_output

# Read cluster.properties
with open('/data/properties/cluster.properties') as fp:
    properties = jprops.load_properties(fp)


# Kerberos ticket
def kerberos_ticket():
    principal = properties["principal"]
    # Assumption: the original snippet never defines keyTab; it is read here
    # from a "keyTab" property by analogy with the other settings.
    keyTab = properties["keyTab"]
    # Pass the arguments as a list so that no shell is needed
    call(["kinit", "-kt", keyTab, principal])


# Hbase connection
def hbase_connection():
    # Get hbase data
    thriftHost = properties["thriftHost"]
    hbaseService = properties["hbaseService"]
    Tsock = TSocket.TSocket(thriftHost, 9090)
    Tsock.setTimeout(2000000)  # Timeout in milliseconds
    transport = TTransport.TSaslClientTransport(
        Tsock,
        host=thriftHost,
        service=hbaseService,
        mechanism='GSSAPI'
    )
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    return client, transport


# Get kerberized ticket
kerberos_ticket()
client, transport = hbase_connection()
transport.open()
print(client.getTableNames())
###########
I found that the TTransport.py code has a comment suggesting that it only supports TSocket:
https://github.com/apache/thrift/blob/master/lib/py/src/transport/TTransport.py
TTransport.TSaslClientTransport
"transport: an underlying transport to use, typically just a TSocket"
We tried to use
https://github.com/apache/thrift/blob/master/lib/py/src/transport/THttpClient.py
THttpClient.THttpClient(url)
but it cannot be used with TTransport.TSaslClientTransport for SASL Kerberos.
Please advise whether Python can be used at all with Cloudera-managed Kerberos Hbase Thrift over HTTPS, or suggest any alternative method to connect to Hbase (Kerberos) using Python.
PS: I went through this link describing a similar issue, but it had no concrete solution:
Python program to connect to HBase via thrift server in Http mode
Thanks in advance,
Manjil
I have found a solution for using Kerberos with Hbase Thrift in HTTP mode.
Client-side Python packages:
- six 1.15.0
- thrift 0.13.0
- hbase-thrift 0.20.4
- pykerberos 1.2.1
### Python code
# Prerequisite: kinit has been run and a Kerberos ticket is available for the user
# Hbase thrift is running in HTTP protocol secure mode
# The Python code uses the ticket from the local Kerberos cache,
# adds the Kerberos context to the HTTP header,
# and performs Hbase client operations such as get table, table scan, etc.
#
# Important: the session opened on the httpClient transport is valid for one call only;
# for the next Hbase operation, get a new Kerberos context (krb_context), set the header and open the session again
##
import kerberos
from thrift import Thrift
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol
from hbase.Hbase import Client
import ssl


def kerberos_auth():
    # The service can be hbase or HTTP, depending on the Hbase thrift configuration
    hbaseService = "<hbase>/<HOST>@<DOMAIN.COM>"
    clientPrincipal = "<user>@<DOMAIN.COM>"
    __, krb_context = kerberos.authGSSClientInit(hbaseService, principal=clientPrincipal)
    kerberos.authGSSClientStep(krb_context, "")
    negotiate_details = kerberos.authGSSClientResponse(krb_context)
    headers = {'Authorization': 'Negotiate ' + negotiate_details,
               'Content-Type': 'application/binary'}
    return headers


# ssl_context=ssl._create_unverified_context() is passed only if no SSL verification is required
httpClient = THttpClient.THttpClient(
    'https://<THRIFT_HOST>:9090/',
    cert_file='<client cert file path>.crt',
    key_file='<client cert key file path>.key',
    ssl_context=ssl._create_unverified_context())

# for new session start
httpClient.setCustomHeaders(headers=kerberos_auth())
protocol = TBinaryProtocol.TBinaryProtocol(httpClient)
httpClient.open()
client = Client(protocol)
# for new session end
client.getTableNames()
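Since the note above says each Hbase operation needs a newly negotiated header, the refresh step could be wrapped in a small helper. This is only a sketch under assumptions: `build_negotiate_headers`, `call_with_fresh_auth` and the `new_token` callable are hypothetical names that do not come from thrift, hbase-thrift or pykerberos. The only library method it relies on is `setCustomHeaders`, which is already used in the code above.

```python
def build_negotiate_headers(token):
    """Build the HTTP headers that carry a base64-encoded Negotiate token."""
    return {
        "Authorization": "Negotiate " + token,
        "Content-Type": "application/binary",
    }


def call_with_fresh_auth(transport, new_token, operation):
    """Set a freshly negotiated header on the transport, then run one call.

    transport -- any object with setCustomHeaders(headers), e.g. THttpClient
    new_token -- hypothetical zero-argument callable returning a new token
    operation -- zero-argument callable performing exactly one client call
    """
    transport.setCustomHeaders(headers=build_negotiate_headers(new_token()))
    return operation()
```

With the objects from the code above, a call would look roughly like `call_with_fresh_auth(httpClient, new_token, client.getTableNames)`, where `new_token` wraps the three `kerberos.authGSSClient*` calls from `kerberos_auth()` and returns `negotiate_details`.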
Thank you,
Manjil