How to connect to Hive using Python pyhs2?


Question

I am trying to access Hive using pyhs2. I tried the following code:

example.py

import pyhs2
conn = pyhs2.connect(host='localhost', port=10000,authMechanism=None, user=None, password=None,database='default')
with conn.cursor() as cur:
    cur.execute("select * from table")
    for i in cur.fetch():
        print(i)

I am getting the following error:

Traceback (most recent call last):
  File "example.py", line 2, in <module>
    conn = pyhs2.connect(host='localhost', port=10000,authMechanism=None, user=None, password=None,database='default')
  File "build/bdist.linux-x86_64/egg/pyhs2/__init__.py", line 7, in connect
  File "build/bdist.linux-x86_64/egg/pyhs2/connections.py", line 46, in __init__
  File "build/bdist.linux-x86_64/egg/pyhs2/cloudera/thrift_sasl.py", line 55, in open
  File "build/bdist.linux-x86_64/egg/thrift/transport/TSocket.py", line 101, in open
thrift.transport.TTransport.TTransportException: Could not connect to localhost:10000

I get exactly the same error when I try hive utils. I have checked that sasl is installed. Do I need to change anything in Hive's hive-site.xml? If so, where do I need to create it? Am I missing something?

Solution

1- Find the IP address of the local machine (on Linux):

hostname -I

2- Replace localhost in the connect call with that IP address.

I would also suggest double-checking which host Hive is actually running on. If you are using Hortonworks, open Ambari, go to Hive, then Configs, and check the host listed there.
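The error in the question is raised while opening the socket, before any authentication takes place, so it can help to confirm that something is listening on the host and port at all. The following is a minimal sketch using only the standard library; the helper name can_connect and the 3-second timeout are illustrative choices, not part of pyhs2.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 10000 is the port used in the question; add the address printed by
    # `hostname -I` to this tuple to compare it against localhost.
    for host in ("localhost", "127.0.0.1"):
        print(host, can_connect(host, 10000))
```

If this reports False for every address you try, the problem is likely that HiveServer2 is not running or is listening on a different host or port, rather than anything in the pyhs2 arguments.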

Edit (adding another suggestion):

Your username and password most likely aren't None. To find them, check hive-site.xml and look at the values of javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword. If you can't find anything, try an empty string as the password (rather than None), and either hive or an empty string as the username. That is, try these one at a time:

conn = pyhs2.connect(host='localhost', port=10000,authMechanism='PLAIN', user='hive', password='',database='default')

conn = pyhs2.connect(host='localhost', port=10000,authMechanism='PLAIN', user='', password='',database='default')

Note that I also changed authMechanism from None to "PLAIN".
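To look up the two hive-site.xml properties mentioned above without searching the file by hand, something like the following sketch could be used. It assumes the usual Hadoop-style layout of property elements with name and value children. The function name read_hive_properties and the sample values are hypothetical, and the location of the real file depends on your installation (often the conf directory of the Hive install).

```python
import xml.etree.ElementTree as ET


def read_hive_properties(xml_text, names):
    """Return {name: value} for the requested properties found in the XML text."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in names:
            # findtext returns "" when the <value> element is present but empty.
            found[name] = prop.findtext("value")
    return found


# Hypothetical sample standing in for a real hive-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value></value>
  </property>
</configuration>"""

WANTED = (
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword",
)

if __name__ == "__main__":
    print(read_hive_properties(SAMPLE, WANTED))
```

For a real file, read its contents with open(path).read() and pass the text in place of SAMPLE. A property that is missing from the file is simply absent from the returned dictionary.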
