Spark-HBase - GCP template (3/3) - Missing libraries?


Problem description

I'm trying to test the Spark-HBase connector on GCP. I followed the instructions, which ask you to package the connector locally, and after completing these steps I get the following error when submitting the job on Dataproc.

Command

gcloud dataproc jobs submit spark --cluster $SPARK_CLUSTER --class com.example.bigtable.spark.shc.BigtableSource --jars target/scala-2.11/cloud-bigtable-dataproc-spark-shc-assembly-0.1.jar --region us-east1 -- $BIGTABLE_TABLE

Error

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration

Recommended answer

I got it working by adding the following dependencies to build.sbt (thanks to @jccampanero for the guidance):

libraryDependencies += "org.apache.hbase" % "hbase-common" % "2.0.2"
libraryDependencies += "org.apache.hbase" % "hbase-mapreduce" % "2.0.2"
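A NoClassDefFoundError at runtime usually means the class was not on the classpath, so one way to check whether the rebuilt assembly jar now bundles org.apache.hadoop.hbase.HBaseConfiguration is to search its entry listing. The sketch below is only an illustration: the helper name jar_listing_has_class is hypothetical (not part of the original answer), and the commented usage assumes that unzip is installed and that the jar path is the one from the question.

```shell
# Hypothetical helper (not from the original answer): reads a jar entry
# listing on stdin, one path per line, and succeeds only if the given
# fully qualified class name appears as a .class entry.
jar_listing_has_class() {
  # Convert org.foo.Bar to org/foo/Bar.class
  class_path="$(printf '%s' "$1" | tr . /).class"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF "$class_path"
}

# Usage against the assembly jar from the question (assumes unzip is available;
# "unzip -Z1" prints one entry name per line):
# unzip -Z1 target/scala-2.11/cloud-bigtable-dataproc-spark-shc-assembly-0.1.jar \
#   | jar_listing_has_class org.apache.hadoop.hbase.HBaseConfiguration \
#   && echo "class is bundled" || echo "class is missing"
```

If the check reports the class as missing, the assembly was probably built before the dependencies were added and needs to be rebuilt before resubmitting the job.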

Output (BigtableSource.scala)

+------+-----+----+----+
|  col0| col1|col2|col3|
+------+-----+----+----+
|row000| true| 0.0|   0|
|row001|false| 1.0|   1|
|row002| true| 2.0|   2|
|row003|false| 3.0|   3|
|row004| true| 4.0|   4|
|row005|false| 5.0|   5|
|row006| true| 6.0|   6|
|row007|false| 7.0|   7|
|row008| true| 8.0|   8|
|row009|false| 9.0|   9|
|row010| true|10.0|  10|
|row011|false|11.0|  11|
|row012| true|12.0|  12|
|row013|false|13.0|  13|
|row014| true|14.0|  14|
|row015|false|15.0|  15|
|row016| true|16.0|  16|
|row017|false|17.0|  17|
|row018| true|18.0|  18|
|row019|false|19.0|  19|
+------+-----+----+----+
only showing top 20 rows

+------+-----+
|  col0| col1|
+------+-----+
|row000| true|
|row001|false|
|row002| true|
|row003|false|
|row004| true|
|row005|false|
+------+-----+

+------+-----+
|  col0| col1|
+------+-----+
|row000| true|
|row001|false|
|row002| true|
|row003|false|
|row004| true|
|row005|false|
+------+-----+

+------+-----+
|  col0| col1|
+------+-----+
|row251|false|
|row252| true|
|row253|false|
|row254| true|
|row255|false|
+------+-----+

+-----------+
|count(col1)|
+-----------+
|         50|
+-----------+
