Aggregate queries fail in Hive if the partition directory doesn't exist

Views: 332 | Posted: 2018/6/12 14:21:10 | Tags: hadoop, hive, hadoop-partitioning

Problem description

I am using Hive v1.2.1 with Tez. I have an external partitioned table. The partitions are hourly and of the form p=yyyy_mm_dd_hh. These partition directories in HDFS are likely to be deleted at some point. After they are deleted, Hive still holds the metadata for those partitions, and 'show partitions' still lists a partition whose directory was deleted from HDFS. Normally this does not cause a problem: a select query against such a partition simply returns an empty result set:

hive> select * from test_tab where p='2015_01_01_01';
OK
Time taken: 2.168 seconds

However, running any aggregate query against the same partition produces an error:

hive> select count(*) from test_tab where p='2015_01_01_01';
FAILED: SemanticException java.io.FileNotFoundException: File hdfs://localhost:8020/user/root/data/test_db/test_tab/p=2015_01_01_01 does not exist.

I need aggregate queries to behave the same way as other select queries. This is probably a bug in Hive. Any workarounds or hints for this issue would be appreciated. Best regards.
Solution

Run the command below:

msck repair table test_tab;

Then run your query again.
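
If the repair command does not clear the stale entry, one alternative is to drop the partition's metadata explicitly. This is not part of the original answer. It rests on my understanding that, on older Hive releases, MSCK REPAIR TABLE mainly adds partitions found on HDFS and may not remove ones whose directories are gone. A minimal sketch, assuming the table and partition names from the question:

```sql
-- List the partitions the metastore currently knows about,
-- to confirm the stale entry is still registered.
SHOW PARTITIONS test_tab;

-- Drop only the metastore entry for the partition whose directory is gone.
-- IF EXISTS avoids an error when the partition is already unregistered.
-- For an external table this removes metadata and leaves HDFS data untouched.
ALTER TABLE test_tab DROP IF EXISTS PARTITION (p = '2015_01_01_01');

-- Re-run the aggregate; with the stale entry gone, the planner should
-- no longer try to read the missing directory.
SELECT COUNT(*) FROM test_tab WHERE p = '2015_01_01_01';
```

Newer Hive releases (3.0 and later, as far as I know) also accept a DROP PARTITIONS option on MSCK REPAIR TABLE, which removes every stale entry in one pass. Check the documentation for your version before relying on it.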
