通过Java JDBC连接Hive [英] Connect Hive through Java JDBC
问题描述
这里有一个问题从Java连接到Hive ,但是我的不同
我的配置单元在machine1上运行,我需要使用在machine2上运行的Java服务器传递一些查询.据我了解,Hive具有JDBC接口,用于接收远程查询.我从这里获取了代码- HiveServer2客户端
我安装了本文中写的依赖项:
- hive-jdbc * .jar
- hive-service * .jar
- libfb303-0.9.0.jar
- libthrift-0.9.0.jar
- log4j-1.2.16.jar
- slf4j-api-1.6.1.jar
- slf4j-log4j12-1.6.1.jar
- commons-logging-1.0.4.jar
但是在编译时出现 java.lang.NoClassDefFoundError 错误 完全错误:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at com.bidstalk.tools.RawLogsQuerySystem.HiveJdbcClient.main(HiveJdbcClient.java:25)
StackOverflow上的另一个问题建议在Maven中添加Hadoop API依赖项-配置单元错误 >
我不明白为什么我需要hadoop API来使客户端与Hive连接. JDBC驱动程序不应该与底层查询系统无关吗?我只需要传递一些SQL查询?
我正在使用Cloudera(5.3.1),我想我需要添加CDH依赖项. Cloudera实例正在运行hadoop 2.5.0和HiveServer2
但是服务器在机器1上.在机器上,代码至少应该编译,并且我应该仅在运行时遇到问题!
回答我自己的问题!
通过一些尝试和试用,我对pom文件添加了以下依赖项,从那时起,我就能够在CHD 5.3.1和5.2.1集群上运行代码.
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.13.1-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libthrift</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libfb303</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>2.5.0-mr1-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.5.0-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>0.13.1-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.5.0-cdh5.3.1</version>
</dependency>
<dependency>
请注意,其中某些依赖项可能不是必需的
There is a question here connect from java to Hive but mine is different
My hive running on machine1 and I need to pass some queries using Java server running at machine2. As I understand Hive has a JDBC interface for the purpose of receiving remote queries. I took the code from here - HiveServer2 Clients
I installed the dependencies written in the article:
- hive-jdbc*.jar
- hive-service*.jar
- libfb303-0.9.0.jar
- libthrift-0.9.0.jar
- log4j-1.2.16.jar
- slf4j-api-1.6.1.jar
- slf4j-log4j12-1.6.1.jar
- commons-logging-1.0.4.jar
However I got java.lang.NoClassDefFoundError error at compile time Full Error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at com.bidstalk.tools.RawLogsQuerySystem.HiveJdbcClient.main(HiveJdbcClient.java:25)
Another question at StackOverflow recommended to add Hadoop API dependencies in Maven - Hive Error
I don't understand why do I need hadoop API for a client to connect with Hive. Shouldn't JDBC driver be agnostic of the underlying query system? I just need to pass some SQL query?
Edit: I am using Cloudera(5.3.1), I think I need to add CDH dependencies. Cloudera instance is running hadoop 2.5.0 and HiveServer2
But the servers are at machine 1. On machine the code should at least compile and I should have issues at runtime only!
Answering my own question!
With some hit and trial, I have added following dependencies on my pom file and since then I am able to run code on both CHD 5.3.1 and 5.2.1 cluster.
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.13.1-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libthrift</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libfb303</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>2.5.0-mr1-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.5.0-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>0.13.1-cdh5.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.5.0-cdh5.3.1</version>
</dependency>
<dependency>
Please note that some of these dependencies might not be required
这篇关于通过Java JDBC连接Hive的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!