Requests hang when using Hiveserver2 Thrift Java client

Question

This is a follow-up to an earlier question where I asked what the Hiveserver2 Thrift Java client API is. This question should be able to stand alone without that background if you don't need any more context.

Unable to find any documentation on how to use the hiveserver2 Thrift API, I put this together. The best reference I could find was the Apache JDBC implementation.

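// Imports this snippet needs (the TCLIService package shown is the one used by hive-service jars of this era).
import java.util.List;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.hive.service.cli.thrift.*;

TSocket transport = new TSocket("hive.example.com", 10002);
// Timeout is in milliseconds; this value is effectively "wait forever".
transport.setTimeout(999999999);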
TBinaryProtocol protocol = new TBinaryProtocol(transport);
TCLIService.Client client = new TCLIService.Client(protocol);  

transport.open();
TOpenSessionReq openReq = new TOpenSessionReq();
TOpenSessionResp openResp = client.OpenSession(openReq);
TSessionHandle sessHandle = openResp.getSessionHandle();

TExecuteStatementReq execReq = new TExecuteStatementReq(sessHandle, "SHOW TABLES");
TExecuteStatementResp execResp = client.ExecuteStatement(execReq);
TOperationHandle stmtHandle = execResp.getOperationHandle();

TFetchResultsReq fetchReq = new TFetchResultsReq(stmtHandle, TFetchOrientation.FETCH_FIRST, 1);
TFetchResultsResp resultsResp = client.FetchResults(fetchReq);

TRowSet resultsSet = resultsResp.getResults();
List<TRow> resultRows = resultsSet.getRows();
for (TRow resultRow : resultRows) {
    // Print each row; calling toString() alone discards the result.
    System.out.println(resultRow);
}

TCloseOperationReq closeReq = new TCloseOperationReq();
closeReq.setOperationHandle(stmtHandle);
client.CloseOperation(closeReq);
TCloseSessionReq closeConnectionReq = new TCloseSessionReq(sessHandle);
client.CloseSession(closeConnectionReq);

transport.close();

I run this code against a Hiveserver2 instance started with:

export HIVE_SERVER2_THRIFT_PORT=10002;hive --service hiveserver2

When debugging, I never get past the line

TOpenSessionResp openResp = client.OpenSession(openReq);

The client simply hangs until the timeout is reached and the server doesn't write anything to stdout or the logs. Using Wireshark, I can see the TCP segment for OpenSession() is sent and ACK'd. Once I kill the client or the timeout is reached, the server gives me the following:

13/03/14 11:15:33 ERROR server.TThreadPoolServer: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
    at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
    at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
    ... 4 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 10 more

I find it interesting that this is the exact same error I received when I mistakenly tried to use a hiveserver (1) client against hiveserver2, which suggests that, as far as hiveserver2 is concerned, my client is sending it garbage.

I see three possibilities for where I might be going wrong.

1) My use of the client API is wrong. I saw that in the JDBC implementation there was some stuff going on with authentication and connection parameters which I'm not using in my example code. I played around with that, but I was shooting in the dark and didn't get any further.

2) I got some setup step wrong. I couldn't find TCLIService in the hive-service-0.10.0 jar, but I did find it in the hive-service-0.10.0.21 jar that Hortonworks ships in HDP 1.2, so maybe digging around in that will reveal the issue. Or maybe there is something I need to configure on the server side, which would explain why I can connect to Hive using ODBC but not with my Thrift client.

3) It could be that, at this point, it is impossible to write against the hiveserver2 client API. This is plausible given the lack of documentation and the apparent lack of successful examples on the internet, but the JDBC driver seems to do it. I find this the least likely option.

Even if you don't know a fix, knowing if the fix falls under 1, 2, or 3 would help narrow my search.

Recommended answer

Not sure if you're still experiencing this issue, but since I've run into the same problem and resolved it (perhaps "bypassed" is a more accurate description), I'll post a solution here in case someone else needs it.

This happens because the Thrift server expects the client to authenticate via SASL when the transport connection is opened. Hive Server 2 uses SASL by default. Unfortunately, PHP lacks a version of TSaslClientTransport (which wraps another TTransport object and handles the SASL negotiation when you open the transport connection).
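To make the mismatch concrete, here is a minimal sketch of what the server is waiting for. It assumes the Thrift SASL transport frames every negotiation message as one status byte followed by a 4-byte big-endian payload length and then the payload, and that a plain TBinaryProtocol call starts with the strict version word 0x80010001. The class name, the status constants, and the "hive"/"anonymous" credentials are illustrative and not taken from libthrift or Hive; only JDK classes are used.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not part of libthrift or Hive.
final class SaslFrameSketch {
    // Status values assumed to match the SASL transport's negotiation states.
    static final byte START = 0x01;
    static final byte OK = 0x02;

    /** Builds one frame: 1 status byte, 4-byte big-endian payload length, payload. */
    static byte[] frame(byte status, byte[] payload) {
        return ByteBuffer.allocate(5 + payload.length)
                .put(status)
                .putInt(payload.length) // ByteBuffer writes big-endian by default
                .put(payload)
                .array();
    }

    /** SASL PLAIN initial response (RFC 4616): authzid NUL authcid NUL password. */
    static byte[] plainInitialResponse(String user, String password) {
        return ("\0" + user + "\0" + password).getBytes(StandardCharsets.UTF_8);
    }

    /** Reads bytes 1..4 of a stream prefix as the payload length a SASL server would expect. */
    static int claimedPayloadLength(byte[] firstBytes) {
        return ByteBuffer.wrap(firstBytes, 1, 4).getInt();
    }

    public static void main(String[] args) {
        // What a SASL-aware client sends first: the mechanism name, then its initial response.
        // The status on the second frame is illustrative; a real client may use a different one.
        byte[] start = frame(START, "PLAIN".getBytes(StandardCharsets.UTF_8));
        byte[] auth = frame(OK, plainInitialResponse("hive", "anonymous"));
        System.out.println("START frame bytes: " + start.length);
        System.out.println("auth frame bytes:  " + auth.length);

        // What an unwrapped TBinaryProtocol call begins with: version word 0x80010001,
        // followed by the first byte of the method-name length.
        byte[] rawCall = {(byte) 0x80, 0x01, 0x00, 0x01, 0x00};
        System.out.println("server would wait for " + claimedPayloadLength(rawCall) + " payload bytes");
    }
}
```

Under these assumptions, the server reads the start of the OpenSession call as a frame header announcing a payload of roughly 16 MB and blocks waiting for it, which is consistent with a hang that only ends when the client disconnects.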

The easiest solution for now is to set the following property in your hive-site.xml:

<property>
  <name>hive.server2.authentication</name>
  <value>NOSASL</value>
</property>
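If the client still hangs after the change, it can help to confirm which value the configuration file actually contains. The following is a small sketch using only the JDK's XML parser. The class and method names are hypothetical, and it assumes the usual Hadoop-style layout of property elements with name and value children; it reads the XML from a string, so load your own hive-site.xml contents as needed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting a hive-site.xml style document; not part of Hive.
final class HiveSiteCheck {
    /** Returns the value of the named property, or null if it is not present. */
    static String property(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                if (name.equals(childText(p, "name"))) {
                    return childText(p, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse configuration XML", e);
        }
    }

    /** Text of the first descendant element with the given tag, or null if there is none. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<configuration><property><name>hive.server2.authentication</name>"
                + "<value>NOSASL</value></property></configuration>";
        System.out.println("hive.server2.authentication = "
                + property(xml, "hive.server2.authentication"));
    }
}
```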
