Spark-HBase - GCP template (2/3) - Version issue of json4s?


Problem description


I'm trying to test the Spark-HBase connector in the GCP context. I followed 1, which asks to locally package the connector [2] using Maven (I tried Maven 3.6.3) for Spark 2.4, and I get the following error when submitting the job on Dataproc (after having completed [3]).

Any idea?

Thanks for your support.

References

1

[2] https://github.com/hortonworks-spark/shc/tree/branch-2.4

[3] Spark-HBase - GCP template (1/3) - How to locally package the Hortonworks connector?

Command

(base) gcloud dataproc jobs submit spark --cluster $SPARK_CLUSTER --class com.example.bigtable.spark.shc.BigtableSource --jars target/scala-2.11/cloud-bigtable-dataproc-spark-shc-assembly-0.1.jar --region us-east1 -- $BIGTABLE_TABLE

Error

Job [d3b9107ae5e2462fa71689cb0f5909bd] submitted.
Waiting for job output...
20/12/27 12:50:10 INFO org.spark_project.jetty.util.log: Logging initialized @2475ms
20/12/27 12:50:10 INFO org.spark_project.jetty.server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
20/12/27 12:50:10 INFO org.spark_project.jetty.server.Server: Started @2576ms
20/12/27 12:50:10 INFO org.spark_project.jetty.server.AbstractConnector: Started ServerConnector@3e6cb045{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/12/27 12:50:10 WARN org.apache.spark.scheduler.FairSchedulableBuilder: Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in fairscheduler.xml or set spark.scheduler.allocation.file to a file that contains the configuration.
20/12/27 12:50:11 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at spark-cluster-m/10.142.0.10:8032
20/12/27 12:50:11 INFO org.apache.hadoop.yarn.client.AHSProxy: Connecting to Application History server at spark-cluster-m/10.142.0.10:10200
20/12/27 12:50:13 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1609071162129_0002
Exception in thread "main" java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse$default$3()Z
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog$.apply(HBaseTableCatalog.scala:262)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseRelation.<init>(HBaseRelation.scala:84)
    at org.apache.spark.sql.execution.datasources.hbase.DefaultSource.createRelation(HBaseRelation.scala:61)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:656)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:656)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:656)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
    at com.example.bigtable.spark.shc.BigtableSource$.delayedEndpoint$com$example$bigtable$spark$shc$BigtableSource$1(BigtableSource.scala:56)
    at com.example.bigtable.spark.shc.BigtableSource$delayedInit$body.apply(BigtableSource.scala:19)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at com.example.bigtable.spark.shc.BigtableSource$.main(BigtableSource.scala:19)
    at com.example.bigtable.spark.shc.BigtableSource.main(BigtableSource.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:890)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:217)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
20/12/27 12:50:20 INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@3e6cb045{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}

Recommended answer


Consider reading these related SO questions: 1 and 2.


Under the hood, the tutorial you followed, as well as one of the questions indicated, uses the Apache Spark - Apache HBase Connector provided by HortonWorks.


The problem seems to be related to an incompatibility in the version of the json4s library: in both cases, it seems that using version 3.2.10 or 3.2.11 in the build process will solve the issue.
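The `NoSuchMethodError` in the trace is a binary-compatibility failure, not a compile error: Scala compiles each default argument into a synthetic method (here `parse$default$3`, whose `()Z` descriptor means "no arguments, returns boolean"), and the SHC bytecode invokes that exact signature. If the json4s jar on the Dataproc classpath no longer exposes the method with that shape, linkage fails at runtime. The Java sketch below (all class names hypothetical, for illustration only) simulates the signature lookup the JVM performs:

```java
// Sketch: NoSuchMethodError arises when bytecode compiled against one
// library version links against another whose method signatures differ.
// The caller here expects parse$default$3()Z - a zero-argument method
// returning boolean - exactly as SHC does for json4s 3.2.x.
public class LinkCheckDemo {

    // Stand-in for json4s 3.2.11's JsonMethods$: it HAS the expected method.
    static class OldJsonMethods {
        public boolean parse$default$3() { return false; }
    }

    // Stand-in for a later json4s where the synthetic method changed shape.
    static class NewJsonMethods {
        public String parse$default$3(int unused) { return "changed"; }
    }

    static boolean hasExpectedMethod(Class<?> cls) {
        try {
            // The compiled caller references a zero-arg method returning boolean.
            return cls.getMethod("parse$default$3").getReturnType() == boolean.class;
        } catch (NoSuchMethodException e) {
            return false; // at link time this surfaces as NoSuchMethodError
        }
    }

    public static void main(String[] args) {
        System.out.println(hasExpectedMethod(OldJsonMethods.class)); // true
        System.out.println(hasExpectedMethod(NewJsonMethods.class)); // false
    }
}
```

This is why pinning json4s to 3.2.10/3.2.11 at build time fixes the job even though the source code never changed.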

Add the following dependency in pom.xml (shc-core):

<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-jackson_2.11</artifactId>
  <version>3.2.11</version>
</dependency>
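After adding the dependency, it may help to confirm which json4s version Maven actually resolves into the shaded jar; a quick check (run from the shc-core module, assuming the standard maven-dependency-plugin is available) is:

```shell
# List every org.json4s artifact on the resolved classpath; you want to see
# a single 3.2.11 entry and no newer version pulled in transitively.
mvn dependency:tree -Dincludes=org.json4s
```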

