NoSuchFieldException:parentOffset-Hive在Spark上 [英] NoSuchFieldException: parentOffset - Hive on Spark

查看:44
本文介绍了NoSuchFieldException:parentOffset-Hive在Spark上的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在本地Spark上运行Hive.我已经按照配置单元上的所有配置进行操作官方网站.在hive控制台上,我首先创建了一个简单表,并尝试在其中插入一些值.

I'm trying to run Hive on Spark locally. I have followed all the configurations on the hive official site. On the hive console, I firstly created a simple table and tried to insert a few values into it.

set hive.cli.print.current.db=true;    
create temporary table sketch_input (id int, category char(1));
insert into table sketch_input values (1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'), (6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b');

但是在提交作业并启动Spark执行程序之后,我得到了 NoSuchFieldException:parentOffset

But after the job is submitted and the spark executor is started I get a NoSuchFieldException: parentOffset

这是详细的日志:

...
2020-03-23T10:00:52,387  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: Executing command(queryId=rudip7_20200323100050_ce101e58-7b74-4335-8086-420c7999fe86): insert into table sketch_input values (1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'), (6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b')
2020-03-23T10:00:52,388  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: Query ID = rudip7_20200323100050_ce101e58-7b74-4335-8086-420c7999fe86
2020-03-23T10:00:52,388  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: Total jobs = 1
2020-03-23T10:00:52,398  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: Launching Job 1 out of 1
2020-03-23T10:00:52,398  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: Starting task [Stage-1:MAPRED] in serial mode
2020-03-23T10:00:52,399  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkTask: In order to change the average load for a reducer (in bytes):
2020-03-23T10:00:52,402  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkTask:   set hive.exec.reducers.bytes.per.reducer=<number>
2020-03-23T10:00:52,402  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkTask: In order to limit the maximum number of reducers:
2020-03-23T10:00:52,403  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkTask:   set hive.exec.reducers.max=<number>
2020-03-23T10:00:52,403  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkTask: In order to set a constant number of reducers:
2020-03-23T10:00:52,404  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkTask:   set mapreduce.job.reduces=<number>
2020-03-23T10:00:52,418  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] session.SparkSessionManagerImpl: Setting up the session manager.
2020-03-23T10:00:52,647  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] session.SparkSession: Trying to open Spark session b4fef427-04e7-41d4-a451-7307ddf47d7f
2020-03-23T10:00:52,885  WARN [59732027-99c6-45dd-b44e-f00b10de99b2 main] util.Utils: Your hostname, DESKTOP-67DEOR0 resolves to a loopback address: 127.0.1.1; using 10.8.3.28 instead (on interface eth1)
2020-03-23T10:00:52,886  WARN [59732027-99c6-45dd-b44e-f00b10de99b2 main] util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
2020-03-23T10:00:52,950  WARN [59732027-99c6-45dd-b44e-f00b10de99b2 main] spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2020-03-23T10:00:52,974  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Running client driver with argv: /home/rudip7/spark/bin/spark-submit --properties-file /tmp/spark-submit.483259982061752596.properties --class org.apache.hive.spark.client.RemoteDriver /home/rudip7/hive/lib/hive-exec-3.1.2.jar --remote-host DESKTOP-67DEOR0 --remote-port 63026 --conf hive.spark.client.connect.timeout=1000 --conf hive.spark.client.server.connect.timeout=90000 --conf hive.spark.client.channel.log.level=null --conf hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256 --conf hive.spark.client.rpc.server.address=null
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Class path contains multiple SLF4J bindings.
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Found binding in [jar:file:/home/rudip7/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Found binding in [jar:file:/home/rudip7/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Found binding in [jar:file:/usr/share/java/slf4j-simple.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Found binding in [jar:file:/home/rudip7/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Found binding in [jar:file:/home/rudip7/hadoop/share/hadoop/common/lib/slf4j-simple.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2020-03-23T10:00:54,353  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-03-23T10:00:54,388  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
2020-03-23T10:00:54,388  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
2020-03-23T10:00:54,388  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
2020-03-23T10:00:54,388  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
2020-03-23T10:00:54,388  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
2020-03-23T10:00:54,481  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:00:54 WARN  Utils:66 - Your hostname, DESKTOP-67DEOR0 resolves to a loopback address: 127.0.1.1; using 10.8.3.28 instead (on interface eth1)
2020-03-23T10:00:54,481  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:00:54 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2020-03-23T10:00:54,540  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:00:54 WARN  SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2020-03-23T10:00:54,715  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Running Spark using the REST application submission protocol.
2020-03-23T10:00:54,734  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:00:54 INFO  RestSubmissionClient:54 - Submitting a request to launch an application in spark://localhost:7077.
2020-03-23T10:01:05,111  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:01:05 WARN  RestSubmissionClient:66 - Unable to connect to server spark://localhost:7077.
2020-03-23T10:01:05,113  INFO [RemoteDriver-stderr-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: Warning: Master endpoint spark://localhost:7077 was not a REST server. Falling back to legacy submission gateway instead.
2020-03-23T10:01:05,114  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:01:05 WARN  SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2020-03-23T10:01:05,250  INFO [RemoteDriver-stdout-redir-59732027-99c6-45dd-b44e-f00b10de99b2 main] client.SparkClientImpl: 2020-03-23 10:01:05 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-03-23T10:01:08,601  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] session.SparkSession: Spark session b4fef427-04e7-41d4-a451-7307ddf47d7f is successfully opened
2020-03-23T10:01:08,619  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Context: New scratch dir is hdfs://localhost:9000/tmp/hive/rudip7/59732027-99c6-45dd-b44e-f00b10de99b2/hive_2020-03-23_10-00-50_422_8039993009763549672-1
2020-03-23T10:01:09,753  INFO [RPC-Handler-2] client.SparkClientImpl: Received result for 0587cf93-0e64-4879-be4c-16b9811c5b49
2020-03-23T10:01:10,672 ERROR [59732027-99c6-45dd-b44e-f00b10de99b2 main] status.SparkJobMonitor: Job failed with java.lang.NoSuchFieldException: parentOffset
java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:388)
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:234)
    at org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:51)
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:278)
    at org.apache.hadoop.hive.ql.exec.spark.KryoSerializer.deserialize(KryoSerializer.java:56)
    at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:341)
    at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:378)
    at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.NoSuchFieldException: parentOffset
    at java.base/java.lang.Class.getDeclaredField(Class.java:2411)
    at org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:382)
    ... 11 more

2020-03-23T10:01:10,703  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] reexec.ReOptimizePlugin: ReOptimization: retryPossible: false
2020-03-23T10:01:10,705 ERROR [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during runtime. Please check stacktrace for the root cause.
2020-03-23T10:01:10,705  INFO [59732027-99c6-45dd-b44e-f00b10de99b2 main] ql.Driver: Completed executing command(queryId=rudip7_20200323100050_ce101e58-7b74-4335-8086-420c7999fe86); Time taken: 18.318 seconds

我正在使用Hive 3.1.2,Spark 2.3.0,Hadoop 3.1.3

I'm using Hive 3.1.2, Spark 2.3.0, Hadoop 3.1.3

而且我正在运行 java 8 ,如您所见此处

And I'm running java 8 as you can see here

有人知道如何解决这个问题吗?预先感谢!

Do anyone know how to solve this problem? Thanks in advance!

推荐答案

问题是Spark执行程序JAVA_HOME设置为 java 11 ,尽管问题中显示的JAVA_HOME全局指向Java 8.为了解决这个问题,我在 spark-env.sh 中添加了以下命令 export JAVA_HOME =/usr/lib/jvm/< java 8> 的路径,错误是不见了.

The problem was that the Spark executor JAVA_HOME was set to java 11, despite that the JAVA_HOME was globally pointing to java 8 as shown in the question. To solve this I added to the spark-env.sh the following command export JAVA_HOME=/usr/lib/jvm/<Path to java 8> and the error is gone.

这篇关于NoSuchFieldException:parentOffset-Hive在Spark上的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆