Connect Hive through Java JDBC

Problem Description


There is an existing question here, connect from java to Hive, but mine is different.

My Hive is running on machine1 and I need to send some queries from a Java server running on machine2. As I understand it, Hive has a JDBC interface for receiving remote queries. I took the code from here - HiveServer2 Clients
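
For reference, this is roughly the kind of client the HiveServer2 Clients page describes; a minimal sketch, assuming HiveServer2 is listening on its default port 10000 on machine1 - the hostname, database name ("default"), and credentials below are placeholders, not values taken from the original question:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcClient {

    // JDBC driver class shipped in hive-jdbc (also visible in the stack trace below)
    private static final String DRIVER = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws Exception {
        // Register the Hive JDBC driver
        Class.forName(DRIVER);

        // hive2 URL: host, port (10000 is the HiveServer2 default) and database
        String url = "jdbc:hive2://machine1:10000/default";

        // User/password are placeholders; an unsecured HiveServer2 usually ignores them
        try (Connection con = DriverManager.getConnection(url, "hive", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}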

I installed the dependencies listed in the article:

  1. hive-jdbc*.jar
  2. hive-service*.jar
  3. libfb303-0.9.0.jar
  4. libthrift-0.9.0.jar
  5. log4j-1.2.16.jar
  6. slf4j-api-1.6.1.jar
  7. slf4j-log4j12-1.6.1.jar
  8. commons-logging-1.0.4.jar

However, I got a java.lang.NoClassDefFoundError at compile time. Full error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at com.bidstalk.tools.RawLogsQuerySystem.HiveJdbcClient.main(HiveJdbcClient.java:25)

Another question on StackOverflow recommended adding the Hadoop API dependencies in Maven - Hive Error

I don't understand why a client needs the Hadoop API to connect to Hive. Shouldn't the JDBC driver be agnostic of the underlying query system? I just need to pass some SQL queries.

Edit: I am using Cloudera (5.3.1), so I think I need to add CDH dependencies. The Cloudera instance is running Hadoop 2.5.0 and HiveServer2.

But the servers are on machine1. On machine2 the code should at least compile; I should only have issues at runtime!

Solution

Answering my own question!

With some trial and error, I added the following dependencies to my pom file, and since then I have been able to run the code on both CDH 5.3.1 and 5.2.1 clusters.

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>0.13.1-cdh5.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.thrift</groupId>
    <artifactId>libthrift</artifactId>
    <version>0.9.0</version>
</dependency>
<dependency>
    <groupId>org.apache.thrift</groupId>
    <artifactId>libfb303</artifactId>
    <version>0.9.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>2.5.0-mr1-cdh5.3.1</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.5.0-cdh5.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>0.13.1-cdh5.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.5.0-cdh5.3.1</version>
</dependency>

Please note that some of these dependencies might not be required.
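
One detail the answer leaves implicit: the -cdh5.3.1 artifact versions above are generally not published to Maven Central, so the pom typically also needs Cloudera's Maven repository. A minimal sketch of the repositories section, assuming the standard Cloudera repository URL applies to this setup:

<repositories>
    <repository>
        <id>cloudera</id>
        <!-- Serves the CDH-versioned Hive and Hadoop artifacts referenced above -->
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>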
