Spark SQL 2.1.1 thrift server - unable to move source hdfs to target


Question

It is related to this question: create table xxx as select * from yyy sometimes gets an error.

When using the Spark thrift server to execute multiple statements such as create table xxx as select * from yyy, only the first one succeeds; later attempts always fail with java.io.IOException: Filesystem closed or with doAs problems.
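
For reference, a minimal way to reproduce this pattern is to run two consecutive CTAS statements against the thrift server with beeline. This is only a sketch: the table names are hypothetical placeholders, the host is taken from the logs below, and port 10000 is assumed to be the default thrift server port.

# Connect to the Spark thrift server with beeline and run a CTAS;
# table names are hypothetical, port 10000 is the assumed default.
beeline -u jdbc:hive2://jzf-01:10000 \
  -e "CREATE TABLE task.task_a AS SELECT * FROM task.task_src"

# On an affected 2.1.1 server, a subsequent CTAS tends to fail with
# "Unable to move source ... to destination".
beeline -u jdbc:hive2://jzf-01:10000 \
  -e "CREATE TABLE task.task_b AS SELECT * FROM task.task_src"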

The full error stack trace:

17/05/29 08:44:53 ERROR thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107/.hive-staging_hive_2017-05-29_08-44-50_607_2388239917764085229-3/-ext-10000/part-00000 to destination hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107/part-00000;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
    at org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:766)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:374)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:407)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand.run(CreateHiveTableAsSelectCommand.scala:92)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:231)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:184)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107/.hive-staging_hive_2017-05-29_08-44-50_607_2388239917764085229-3/-ext-10000/part-00000 to destination hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107/part-00000
    at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2644)
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2892)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1640)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.sql.hive.client.Shim_v0_14.loadTable(HiveShim.scala:728)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply$mcV$sp(HiveClientImpl.scala:676)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:676)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:676)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:279)
    at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:226)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:225)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:268)
    at org.apache.spark.sql.hive.client.HiveClientImpl.loadTable(HiveClientImpl.scala:675)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply$mcV$sp(HiveExternalCatalog.scala:768)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:766)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:766)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    ... 40 more
Caused by: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:798)
    at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2966)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:1906)
    at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:262)
    at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1221)
    at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2607)
    ... 59 more
17/05/29 08:44:53 ERROR thriftserver.SparkExecuteStatementOperation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107/.hive-staging_hive_2017-05-29_08-44-50_607_2388239917764085229-3/-ext-10000/part-00000 to destination hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107/part-00000;
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:266)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:184)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

For comparison, this is the log of a normal create table ... as select (CTAS) run:

17/05/29 08:42:30 INFO cluster.YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/05/29 08:42:30 INFO scheduler.DAGScheduler: ResultStage 1 (run at AccessController.java:0) finished in 2.079 s
17/05/29 08:42:30 INFO scheduler.DAGScheduler: Job 1 finished: run at AccessController.java:0, took 2.100557 s
17/05/29 08:42:30 INFO metastore.HiveMetaStore: 2: get_table : db=task tbl=task_106
17/05/29 08:42:30 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_table : db=task tbl=task_106    
17/05/29 08:42:30 INFO metastore.HiveMetaStore: 2: get_table : db=task tbl=task_106
17/05/29 08:42:30 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_table : db=task tbl=task_106    
17/05/29 08:42:30 INFO metadata.Hive: Replacing src:hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_106/.hive-staging_hive_2017-05-29_08-42-26_232_2514893773205547001-1/-ext-10000/part-00000, dest: hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_106/part-00000, Status:true
17/05/29 08:42:30 INFO metadata.Hive: Replacing src:hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_106/.hive-staging_hive_2017-05-29_08-42-26_232_2514893773205547001-1/-ext-10000/part-00001, dest: hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_106/part-00001, Status:true

This is the failing one: after some get_table calls it executes a drop_table, which then causes the Filesystem to close and finally the unable to move source error:

17/05/29 08:42:50 INFO cluster.YarnScheduler: Removed TaskSet 6.0, whose tasks have all completed, from pool
17/05/29 08:42:50 INFO scheduler.DAGScheduler: ResultStage 6 (run at AccessController.java:0) finished in 2.567 s
17/05/29 08:42:50 INFO scheduler.DAGScheduler: Job 3 finished: run at AccessController.java:0, took 2.819549 s
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: get_table : db=task tbl=task_107
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_table : db=task tbl=task_107    
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: get_table : db=task tbl=task_107
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_table : db=task tbl=task_107    
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: get_database: task
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_database: task  
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: get_table : db=task tbl=task_107
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_table : db=task tbl=task_107    
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: get_database: task
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_database: task  
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: get_table : db=task tbl=task_107
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=get_table : db=task tbl=task_107    
17/05/29 08:42:51 INFO metastore.HiveMetaStore: 6: drop_table : db=task tbl=task_107
17/05/29 08:42:51 INFO HiveMetaStore.audit: ugi=root    ip=unknown-ip-addr  cmd=drop_table : db=task tbl=task_107   
17/05/29 08:42:51 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:51 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:51 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:51 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:51 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:51 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:52 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:52 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/05/29 08:42:52 INFO metastore.hivemetastoressimpl: deleting  hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107
17/05/29 08:42:52 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
17/05/29 08:42:52 INFO metastore.hivemetastoressimpl: Deleted the diretory hdfs://jzf-01:9000/user/hive/warehouse/task.db/task_107
17/05/29 08:42:52 ERROR thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING,

Answer

Try setting hive.exec.stagingdir in your hive-site.xml like this:

<property>
  <name>hive.exec.stagingdir</name>
  <value>/tmp/hive/spark-${user.name}</value>
</property>

This worked for a customer who upgraded from 1.6.2 to 2.1.1 and hit the same problem with CTAS. On our dev cluster this got us past your particular error, though we still have some HDFS permission issues we are working through.
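
If editing hive-site.xml is not convenient, the same property can also be passed when launching the thrift server via the documented --hiveconf option. A sketch, assuming a standard Spark installation; the single quotes keep the shell from expanding ${user.name}, which is substituted server-side from the Java system property:

# Start the Spark thrift server with the staging dir override;
# single quotes preserve ${user.name} for server-side substitution.
$SPARK_HOME/sbin/start-thriftserver.sh \
  --hiveconf hive.exec.stagingdir='/tmp/hive/spark-${user.name}'

Either way, make sure the chosen staging directory exists in HDFS and is writable by the user running the thrift server, or you may trade this error for a permission one.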

Hope this helps.
