impyla hangs when connecting to HiveServer2

Problem description

I'm writing some ETL flows in Python that, for part of the process, use Hive. Cloudera's impyla client, according to the documentation, works with both Impala and Hive.

In my experience, the client worked for Impala, but hung when I tried to connect to Hive:

from impala.dbapi import connect

conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz')
cursor = conn.cursor()  # <- hangs here
cursor.execute('show tables')
results = cursor.fetchall()
print(results)

If I step into the code, it hangs when it tries to open a session (line 873 of hiveserver2.py).

At first, I suspected that a firewall port might be blocking the connection, and so I tried to connect using Java. To my surprise, this worked:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    public static void main(String[] args) throws SQLException {
        // Make sure the Hive JDBC driver is on the classpath.
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection connection = DriverManager.getConnection("jdbc:hive2://host_running_hs2_service:10000/default", "awoolford", "Bzzzzz");
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("SHOW TABLES");

        // Print one table name per row.
        while (resultSet.next()) {
            System.out.println(resultSet.getString(1));
        }
    }
}
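
The firewall suspicion can also be checked from Python itself, without going through JDBC. The following is a minimal sketch using only the standard library; `port_is_reachable` is a hypothetical helper name, not part of impyla or PyHive. It only confirms that a TCP connection can be opened, so a hang that happens later, while the session is being opened, would not be caught by it.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, DNS failure, or timed out.
        return False
    sock.close()
    return True
```

For example, `port_is_reachable('host_running_hs2_service', 10000)` returning True would suggest that the port is open and the problem lies above the TCP layer.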

Since Hive and Python are such commonly used technologies, I'm curious to know if anyone else has experienced this problem and, if so, what did you do to fix it?

Versions:

  • Hive 1.1.0-cdh5.5.1
  • Python 2.7.11 | Anaconda 2.3.0
  • Redhat 6.7

Solution

I tried Dropbox's PyHive package and it worked perfectly:

from pyhive import hive

conn = hive.Connection(host="host_running_hs2_service", port=10000, username="awoolford")

cursor = conn.cursor()
cursor.execute("SHOW TABLES")
for table in cursor.fetchall():
    print(table)

I'm not sure why the Impyla client didn't work, but at least we can move forward.
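
One possible explanation, offered here as an assumption rather than something confirmed in this thread: impyla's `connect()` defaults to `auth_mechanism='NOSASL'`, while a HiveServer2 instance left on its default authentication setting expects a SASL handshake, and that mismatch could leave the client waiting while it opens the session. The sketch below passes `auth_mechanism='PLAIN'` and a `timeout` to impyla. The helper names `hs2_connect_kwargs` and `list_tables` are hypothetical, and whether `'PLAIN'` is the right value depends on how the server is configured.

```python
def hs2_connect_kwargs(host, user, password=None, port=10000, timeout=30):
    """Build keyword arguments for impala.dbapi.connect (hypothetical helper)."""
    kwargs = {
        "host": host,
        "port": port,
        "user": user,
        # Assumption: the server expects SASL PLAIN rather than impyla's NOSASL default.
        "auth_mechanism": "PLAIN",
        # Connection timeout in seconds, so a stalled connection has a chance to fail.
        "timeout": timeout,
    }
    if password is not None:
        kwargs["password"] = password
    return kwargs


def list_tables(host, user, password=None):
    """Connect to HiveServer2 through impyla and return the rows of SHOW TABLES."""
    from impala.dbapi import connect  # imported lazily; requires impyla to be installed

    conn = connect(**hs2_connect_kwargs(host, user, password))
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW TABLES")
        return cursor.fetchall()
    finally:
        conn.close()
```

If the hang persists with these settings, the cause is probably something else, and PyHive remains the workaround that is known to work here.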
