Connecting to a Kerberized Hadoop cluster using the Python module impyla


Problem description


I am using the impyla module to connect to a Kerberized Hadoop cluster. I want to access
HiveServer2/Hive, but I get the error below:

test_conn.py

from impala.dbapi import connect
import os
connection_string = 'hdp296m1.XXX.XXX.com'
conn = connect(host=connection_string, port=21050,auth_mechanism="GSSAPI",kerberos_service_name='testuser@Myrealm.COM',password='testuser')
cursor = conn.cursor()
cursor.execute('select count(*) from t_all_types_simple_t')
print cursor.description
results = cursor.fetchall()

Stacktrace:

[vagrant@localhost vagrant]$ python test_conn.py
Traceback (most recent call last):
  File "test_conn.py", line 4, in <module>
    conn = connect(host=connection_string, port=21050, auth_mechanism="GSSAPI",kerberos_service_name='testuser@Myrealm.COM',password='testuser')
  File "/usr/lib/python2.7/site-packages/impala/dbapi.py", line 147, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/lib/python2.7/site-packages/impala/hiveserver2.py", line 758, in connect
    transport.open()
  File "/usr/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 61, in open
    self._trans.open()
  File "/usr/lib64/python2.7/site-packages/thrift/transport/TSocket.py", line 101, in open
    message=message)
thrift.transport.TTransport.TTransportException: Could not connect to hdp296m1.XXX.XXX.com:21050

testuser is my Kerberos principal, which I use to run kinit.
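
The traceback above ends in TSocket.open, which is called before any SASL/Kerberos negotiation starts, so the failure looks like a plain TCP connection problem (wrong port, service not running, or a firewall). A minimal sketch for checking reachability from the client machine follows; the helper name can_connect is hypothetical and not part of impyla or thrift.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and attempts the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return False
```

For example, can_connect('hdp296m1.XXX.XXX.com', 10000) and can_connect('hdp296m1.XXX.XXX.com', 21050) would show which of the two ports, if either, accepts connections from this machine.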

Solution

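Your connection parameters appear to be incorrect. kerberos_service_name expects the service name of the server's Kerberos principal (for example hive), not your user principal, and no password is needed once you have a ticket from kinit. HiveServer2 also listens on port 10000 by default, whereas 21050 is the default Impala port. Try: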

from impala.dbapi import connect
import os

# Set your parameters
host = os.environ.get("CDH_HIVE", 'x.x.x.x')
# HiveServer2 listens on port 10000 by default
port = int(os.environ.get("CDH_HIVE_port", '10000'))
auth_mechanism = os.environ.get("CDH_auth", 'GSSAPI')
user = 'hive'
db = 'mydb'
# No password: authentication uses the Kerberos ticket obtained with kinit
password = ''
# Service name of the server's Kerberos principal, not your user principal
kbservice = 'hive'


class Hive:

    def __init__(self, db):
        self.database = db
        self.__conn = connect(host=host,
                              port=port,
                              auth_mechanism=auth_mechanism,
                              user=user,
                              password=password,
                              database=db,
                              kerberos_service_name=kbservice
                              )
        self.__cursor = self.__conn.cursor()


h = Hive(db)
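
The Hive class above opens a connection and a cursor but does not show a query being run. As a rough sketch, assuming the connection follows the Python DB-API 2.0 interface (which impyla's connect is documented to return), a small helper could run a statement and collect the results. The name fetch_rows is hypothetical, and the impyla calls appear only in comments so the block does not depend on a live cluster.

```python
from contextlib import closing


def fetch_rows(conn, sql):
    """Run sql on a DB-API 2.0 connection and return (column_names, rows)."""
    # closing() makes sure the cursor is closed even if execute() raises
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        # The first field of each description entry is the column name
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    return columns, rows


# Intended use with impyla (assumes impyla is installed and kinit has been run):
#   from impala.dbapi import connect
#   conn = connect(host=host, port=10000, auth_mechanism='GSSAPI',
#                  kerberos_service_name='hive', database='mydb')
#   columns, rows = fetch_rows(conn, 'select count(*) from t_all_types_simple_t')
```

Because the helper only relies on cursor(), execute(), description and fetchall(), it can be tried out against any DB-API connection before pointing it at the cluster.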
