Hive: every INSERT query creates a new file in the HDFS file system
Question
On every INSERT query, a new file named 000000_0_copy* gets created in the HDFS file system.
Is this the default behaviour of Hive and HDFS?
Is there any concept of compaction? If yes, how does compaction work?
Answer
HDFS is an append-only filesystem. To modify (UPDATE/DELETE statements) any portion of a file that has already been written, you must rewrite the entire file and replace the old one; and inserting even a single record writes a brand-new file rather than appending to an existing one.
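This is why each INSERT produces a new 000000_0_copy* file. A minimal illustration (the table and values here are hypothetical, not from the original post):

```sql
-- Each single-row INSERT writes a fresh file under the table's
-- HDFS directory instead of appending to an existing one, e.g.:
INSERT INTO TABLE events VALUES (1, 'a');  -- may produce 000000_0
INSERT INTO TABLE events VALUES (2, 'b');  -- may produce 000000_0_copy_1
```

Over many small inserts this accumulates many tiny files, which is the usual motivation for compaction.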
Compaction isn't an automatic process. You need to write your own code to query one table and insert the results into another table, typically stored in a columnar format such as Parquet or ORC.
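A manual compaction pass can be sketched roughly as follows (table names and the choice of ORC are illustrative assumptions, not prescribed by the answer):

```sql
-- Sketch: rewrite a table full of small files into an ORC-backed copy.
CREATE TABLE events_compacted STORED AS ORC AS
SELECT * FROM events;

-- Alternatively, rewrite the table in place: INSERT OVERWRITE replaces
-- the many small files with the fewer, larger files of a single job.
INSERT OVERWRITE TABLE events
SELECT * FROM events;
```

Either approach rewrites all the data in one job, so it should be scheduled deliberately rather than run after every insert.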