How to avoid generating empty .deflate files for a Hive query?


Problem description



When I run a Hive query, a large number of empty .deflate files are generated (they are actually about 8 bytes, which I think is the minimum size for a .deflate file). I suspect this is happening because the query requires a large number of reducers. I am wondering if there is a way to avoid generating these empty .deflate files?

Thanks in advance,

Lin

Solution

.deflate is the file extension written by Hadoop's default compression codec (DefaultCodec).
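The roughly 8-byte size reported in the question is consistent with an empty compressed stream rather than real data. Here is a minimal sketch in Python, assuming the default codec writes zlib-format streams (Python's zlib module stands in for the Hadoop codec here; it is not what Hive itself runs):

```python
import zlib

# Compress zero bytes of input, as a reducer that received no rows would.
empty_stream = zlib.compress(b"")

# The stream is pure framing: a 2-byte zlib header, a 2-byte empty
# deflate block, and a 4-byte Adler-32 checksum.
print(len(empty_stream))

# Decompressing it yields no data at all.
assert zlib.decompress(empty_stream) == b""
```

So a file of this size most likely means the task that wrote it produced no rows.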

There are compression settings for Hive that can be used to reduce the amount of disk space that Hive uses for its queries.

When the property hive.exec.compress.output is set to true, Hive compresses the final query output written to HDFS, using the codec configured by the mapred.output.compression.codec property (mapred.map.output.compression.codec, by contrast, applies only to intermediate map output). These properties can be set in hive-site.xml or from the Hive CLI.

To enable output compression from the Hive CLI:

hive> set hive.exec.compress.output=true;

To enable output compression using hive-site.xml:

<property>
 <name>hive.exec.compress.output</name>
 <value>true</value>
</property>

So, to stop Hive from writing .deflate files, disable output compression:

set hive.exec.compress.output=false;
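To confirm which output files are actually empty, before or after changing the setting, one option is to copy the query's output directory to local disk and scan it. The sketch below is only illustrative: the helper name find_empty_deflate_files is hypothetical, and it assumes the .deflate files are zlib-format streams.

```python
import zlib
from pathlib import Path


def find_empty_deflate_files(directory):
    """Return names of .deflate files in `directory` that decompress to zero bytes.

    Hypothetical helper; assumes a local copy of the Hive output directory.
    """
    empty = []
    for path in sorted(Path(directory).glob("*.deflate")):
        try:
            if zlib.decompress(path.read_bytes()) == b"":
                empty.append(path.name)
        except zlib.error:
            # Not a valid zlib stream (assumption violated); skip this file.
            continue
    return empty
```

Calling find_empty_deflate_files on the local copy returns the names of the files that carry no rows, which makes it easy to see how many tasks produced nothing.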
