SparkSQL + Hive + HBase + HBaseIntegration doesn't work


Problem description

I am getting an error when I try to connect to a Hive table (created through the HBase integration) from Spark.

Steps I followed:

Hive table creation code:

CREATE TABLE test.sample (id string, name string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,details:name")
TBLPROPERTIES ("hbase.table.name" = "sample");

DESCRIBE test.sample;

 col_name data_type comment
 id string from deserializer
 name string from deserializer

Starting Spark shell with this command:

spark-shell --master local[2] --driver-class-path \
/usr/local/hive/lib/hive-hbase-handler-1.2.1.jar:\
/usr/local/hbase/lib/hbase-server-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-protocol-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-hadoop2-compat-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-hadoop-compat-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-client-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-common-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/htrace-core-2.04.jar:\
/usr/local/hbase/lib/hbase-common-0.98.9-hadoop2-tests.jar:\
/usr/local/hbase/lib/hbase-server-0.98.9-hadoop2-tests.jar:\
/usr/local/hive/lib/zookeeper-3.4.6.jar:\
/usr/local/hive/lib/guava-14.0.1.jar

In spark-shell:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

sqlContext.sql("select count(*) from test.sample").collect()

Stack trace:

SQL context available as sqlContext.

scala> sqlContext.sql("select count(*) from test.sample").collect()

16/09/02 04:49:28 INFO parse.ParseDriver: Parsing command: select count(*) from test.sample
16/09/02 04:49:35 INFO parse.ParseDriver: Parse Completed
16/09/02 04:49:40 INFO metastore.HiveMetaStore: 0: get_table : db=test tbl=sample
16/09/02 04:49:40 INFO HiveMetaStore.audit: ugi=hdfs    ip=unknown-ip-addr  cmd=get_table : db=test tbl=sample  
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes
    at org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
    at org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
    at org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
    at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
    at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:391)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:258)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:605)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1$$anonfun$3.apply(ClientWrapper.scala:331)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1$$anonfun$3.apply(ClientWrapper.scala:326)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1.apply(ClientWrapper.scala:326)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1.apply(ClientWrapper.scala:321)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:279)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:226)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:225)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:268)
    at org.apache.spark.sql.hive.client.ClientWrapper.getTableOption(ClientWrapper.scala:321)
    at org.apache.spark.sql.hive.client.ClientInterface$class.getTable(ClientInterface.scala:122)
    at org.apache.spark.sql.hive.client.ClientWrapper.getTable(ClientWrapper.scala:60)
    at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:384)
    at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:457)
    at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
    at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:457)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:303)

I am using Hadoop 2.6.0, Spark 1.6.0, Hive 1.2.1, and HBase 0.98.9.

I added this setting in hadoop-env.sh:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/lib/*

Can somebody please suggest a solution?

Solution

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes

This happens because the HBase-related jars are not on the classpath.

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`hbase classpath`

This should include all the HBase-related jar files. Alternatively, pass the jars to spark-shell with --jars, as in the sketch below.
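
A minimal sketch of the --jars variant, assuming the same jar locations as in the question (adjust the paths and versions to your installation); note that --jars takes a comma-separated list, while --driver-class-path uses a colon-separated one:

spark-shell --master local[2] --jars \
/usr/local/hive/lib/hive-hbase-handler-1.2.1.jar,\
/usr/local/hbase/lib/hbase-client-0.98.9-hadoop2.jar,\
/usr/local/hbase/lib/hbase-common-0.98.9-hadoop2.jar,\
/usr/local/hbase/lib/hbase-server-0.98.9-hadoop2.jar,\
/usr/local/hbase/lib/hbase-protocol-0.98.9-hadoop2.jar,\
/usr/local/hbase/lib/htrace-core-2.04.jar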

Note: To verify the classpath, you can add the code below in the driver to print all the classpath resources.

Scala version:

val cl = ClassLoader.getSystemClassLoader
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)

Java version:

import java.net.URL;
import java.net.URLClassLoader;
...

ClassLoader cl = ClassLoader.getSystemClassLoader();
URL[] urls = ((URLClassLoader) cl).getURLs();

for (URL url : urls) {
    System.out.println(url.getFile());
}
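
As an additional sanity check (not part of the original answer), you can try to load the class named in the NoClassDefFoundError directly from the spark-shell; if the HBase jars are visible to the driver it resolves, otherwise it throws ClassNotFoundException:

// throws ClassNotFoundException when the HBase jars are missing from the driver classpath
Class.forName("org.apache.hadoop.hbase.util.Bytes")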
