Scala and SparkSQL: ClassNotPersistableException


Problem description

I am trying to create two DataFrames and join them using the dataframe.join method.

Here is the Scala code:

import org.apache.spark.sql.SparkSession
import org.apache.spark.SparkConf

object RuleExecutor {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName(AppConstants.AppName).setMaster("local")
    val sparkSession = SparkSession.builder().appName(AppConstants.AppName).config(sparkConf).enableHiveSupport().getOrCreate()
    import sparkSession.sql

    sql(s"CREATE DATABASE test")

    sql ("CREATE TABLE test.box_width (id INT, width INT)")   // Create table box_width
    sql ("INSERT INTO test.box_width VALUES (1,1), (2,2)")    // Insert data in box_width

    sql ("CREATE TABLE test.box_length (id INT, length INT)") // Create table box_length
    sql ("INSERT INTO test.box_length VALUES (1,10), (2,20)") // Insert data in box_length

    val widthDF = sql("select *  from  test.box_width")       // Get DF for table box_width
    val lengthDF = sql("select *  from  test.box_length")     // Get DF for table box_length

    val dimensionDF = lengthDF.join(widthDF, "id");           // Joining
    dimensionDF.show();
  }
}

But when running the code, I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1062)…..
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)……
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)……
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)……
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)…
Caused by: org.datanucleus.api.jdo.exceptions.ClassNotPersistenceCapableException: The class "org.apache.hadoop.hive.metastore.model.MVersionTable" is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
NestedThrowables:
org.datanucleus.exceptions.ClassNotPersistableException: The class "org.apache.hadoop.hive.metastore.model.MVersionTable" is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:473)……
Caused by: org.datanucleus.exceptions.ClassNotPersistableException: The class "org.apache.hadoop.hive.metastore.model.MVersionTable" is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
    at org.datanucleus.ExecutionContextImpl.assertClassPersistable(ExecutionContextImpl.java:5113)……

The versions I am using are:

Scala = 2.11
Spark-Hive = 2.2.2
Maven-org-spark-project-hive_hive-metastore = 1.x
DataNucleus = 5.x

How can I resolve this issue? (complete log, list of dependencies)

Thanks

Answer

First of all, you no longer need a semicolon (;) at the end of a line when writing Scala code, unless you put more than one expression on the same line.
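A small self-contained illustration of that rule (the names and values here are made up for the example, not taken from your job):

```scala
object SemicolonDemo {
  def main(args: Array[String]): Unit = {
    val width = 2          // no trailing ';' needed at end of line
    val length = 10
    val a = 1; val b = 2   // ';' only separates two expressions on one line
    println(width * length + a + b)  // prints 23
  }
}
```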

Second, I went through your log and found 15 errors, mostly complaints that a database table does not exist or that Hive cannot be found. So I suspect those services are not running correctly. Could you make sure everything (Hive, the MySQL metastore database) is set up correctly before running the Spark job?
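To help isolate the problem, one thing you could try is running the same join without enableHiveSupport(), so Spark uses its built-in in-memory catalog and never touches the Hive metastore (or DataNucleus). This is only a diagnostic sketch, with an illustrative object name, not a fix for the metastore itself:

```scala
import org.apache.spark.sql.SparkSession

object JoinWithoutHive {
  def main(args: Array[String]): Unit = {
    // No enableHiveSupport(): the in-memory catalog is used instead of Hive,
    // so the ClassNotPersistableException path is never exercised.
    val spark = SparkSession.builder()
      .appName("JoinWithoutHive")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same data as the Hive tables in the question, built in memory
    val widthDF  = Seq((1, 1), (2, 2)).toDF("id", "width")
    val lengthDF = Seq((1, 10), (2, 20)).toDF("id", "length")

    lengthDF.join(widthDF, "id").show()  // joins on the shared "id" column
    spark.stop()
  }
}
```

If this version runs cleanly, the join logic is fine and the failure is confined to the Hive/metastore setup.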
