Hive/Hadoop intermittent failure: Unable to move source to destination

Problem description

There have been some SO articles about the Hive/Hadoop "Unable to move source" error. Many of them point to permission problems.
However, at my site I saw the same error, and I am quite sure it is not related to permissions. This is because the problem is intermittent -- it worked one day but failed on another.
I thus looked more deeply into the error message. It was complaining about failing to move from a
.../.hive-staging_hive.../-ext-10000/part-00000-${long-hash}
path to the destination
.../part-00000-${long-hash}
folder. Would this observation ring a bell with someone?
This error was triggered by a super simple test query: just inserting a row into a test table (see below).
Error message
org.apache.hadoop.hive.ql.metadata.HiveException:
Unable to move source
hdfs://namenodeHA/apps/hive/warehouse/some_db.db/testTable1/.hive-staging_hive_2018-02-02_23-02-13_065_2316479064583526151-5/-ext-10000/part-00000-832944cf-7db4-403b-b02e-55b6e61b1af1-c000
to destination
hdfs://namenodeHA/apps/hive/warehouse/some_db.db/testTable1/part-00000-832944cf-7db4-403b-b02e-55b6e61b1af1-c000;
The query that triggered this error (but only intermittently):
insert into testTable1
values (2);
Recommended answer
Thanks for all the help. I have found a solution, and I am providing my own answer here.
The problem was with a "CTAS" (create table as ...) operation that preceded the failing insert command: it closed the file system inappropriately. The telltale sign was an IOException: Filesystem closed message shown together with the failing HiveException: Unable to move source ... to destination operation. (I found the log message in my Spark Thrift Server log, not my application log.)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:3288)
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2093)
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:289)
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1221)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2607)
The solution was actually from another SO article: https://stackoverflow.com/a/47067350/1168041
But here I provide an excerpt in case that article is gone:
Add the property to hdfs-site.xml:
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
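If editing hdfs-site.xml is inconvenient, the same Hadoop property can likely be passed to the Spark Thrift Server through Spark's spark.hadoop.* configuration prefix, which forwards properties to the underlying Hadoop Configuration (the script path below is an assumption about your install layout):

```shell
# Equivalent setting passed at Thrift Server startup via Spark's
# spark.hadoop.* prefix (path to the start script is assumed).
$SPARK_HOME/sbin/start-thriftserver.sh \
  --conf spark.hadoop.fs.hdfs.impl.disable.cache=true
```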
Reason: Spark and HDFS use the same API (under the hood they share the same filesystem instance).
When beeline closes a filesystem instance, it closes the thrift server's filesystem instance too. When a second beeline session tries to get the instance, it will always report "Caused by: java.io.IOException: Filesystem closed".
Please check this issue here:
https://issues.apache.org/jira/browse/SPARK-21725
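The cache-sharing failure mode described in the excerpt can be sketched with a small Python model (this is an illustrative stand-in, not the real Hadoop FileSystem API):

```python
# Illustrative model (NOT the real Hadoop API) of why a cached, shared
# FileSystem instance breaks other sessions when one session closes it,
# and why fs.hdfs.impl.disable.cache=true avoids the problem.

class FileSystem:
    _cache = {}  # analogous to Hadoop's FileSystem cache, keyed by URI

    def __init__(self):
        self.closed = False

    @classmethod
    def get(cls, uri, disable_cache=False):
        if disable_cache:
            return cls()  # every caller gets a fresh, independent instance
        return cls._cache.setdefault(uri, cls())  # all callers share one instance

    def close(self):
        self.closed = True

    def rename(self, src, dst):
        if self.closed:
            raise IOError("Filesystem closed")
        return True

# Cached (default): two sessions receive the SAME instance, so one
# session closing it breaks the other -- the CTAS-then-insert pattern.
fs1 = FileSystem.get("hdfs://namenodeHA")
fs2 = FileSystem.get("hdfs://namenodeHA")
assert fs1 is fs2
fs1.close()
# fs2.rename(...) would now raise IOError("Filesystem closed")

# Cache disabled: instances are independent, so closing one is harmless.
fs3 = FileSystem.get("hdfs://namenodeHA", disable_cache=True)
fs4 = FileSystem.get("hdfs://namenodeHA", disable_cache=True)
assert fs3 is not fs4
fs3.close()
assert fs4.rename("/a", "/b")  # unaffected by fs3.close()
```

The real Hadoop cache is keyed by scheme, authority, and user, but the effect is the same: disabling it trades some instance-reuse efficiency for isolation between sessions.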
I was not using beeline, but the problem with CTAS was the same.
My test sequence:
insert into testTable1
values (11)
create table anotherTable as select 1
insert into testTable1
values (12)
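For reference, this sequence could be replayed against a Thrift Server with beeline (the JDBC URL, host, and port below are assumptions, and the original report did not use beeline):

```shell
# Hypothetical replay of the test sequence via beeline (URL/port assumed).
beeline -u jdbc:hive2://localhost:10000 \
  -e "insert into testTable1 values (11)" \
  -e "create table anotherTable as select 1" \
  -e "insert into testTable1 values (12)"  # this insert failed before the fix
```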
Before the fix, any insert would fail after the create table as ...
After the fix, this problem was gone.