Very basic question about Hadoop and compressed input files


Question

I have started to look into Hadoop. If my understanding is right, I could process a very big file and it would get split over different nodes. However, if the file is compressed, it cannot be split and would need to be processed by a single node (effectively destroying the advantage of running MapReduce over a cluster of parallel machines).

My question is, assuming the above is correct: is it possible to split a large file manually into fixed-size chunks (or daily chunks), compress them, and then pass a list of compressed input files to a MapReduce job?
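If you go the manual-chunking route, each gzip chunk is read whole by one mapper, so parallelism comes from the number of chunks rather than from splitting a single file. Below is a minimal sketch of such a job using the new MapReduce API; the input/output paths and class names are illustrative assumptions, not taken from the original question.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressedChunksJob {
    // Pass-through mapper: Hadoop decompresses each .gz file transparently,
    // inferring the codec from the file extension. One map task per file.
    public static class PassThroughMapper
            extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws java.io.IOException, InterruptedException {
            context.write(key, value);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "process compressed daily chunks");
        job.setJarByClass(CompressedChunksJob.class);
        job.setMapperClass(PassThroughMapper.class);
        job.setNumReduceTasks(0); // map-only for this sketch
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // A glob of daily gzip chunks (hypothetical path); each file becomes
        // its own map task because gzip is not splittable.
        FileInputFormat.setInputPaths(job, new Path("/data/logs/2012-*/*.gz"));
        FileOutputFormat.setOutputPath(job, new Path("/data/output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```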

Answer

Consider using LZO compression. It is splittable, which means a big .lzo file can be processed by many mappers. Bzip2 is also splittable, but it is slow.

Cloudera had an introduction about it. For MapReduce, LZO strikes a good balance between compression ratio and compression/decompression speed.
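To actually get splits out of a single .lzo file, the job needs an LZO-aware input format from the third-party hadoop-lzo library, and the file must be indexed first (for example with that library's DistributedLzoIndexer tool). The sketch below assumes hadoop-lzo is on the classpath; the com.hadoop.* class names come from that library and may differ between versions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import com.hadoop.mapreduce.LzoTextInputFormat; // from hadoop-lzo (assumption)

public class LzoInputJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "splittable lzo input");
        job.setJarByClass(LzoInputJob.class);

        // LzoTextInputFormat reads the .index file produced by the LZO indexer
        // and generates multiple splits for one large .lzo file, so many
        // mappers can work on it in parallel.
        job.setInputFormatClass(LzoTextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/data/big.lzo")); // hypothetical path

        // Mapper/reducer setup omitted; the default identity mapper and
        // reducer work with the LongWritable/Text records this format emits.
        FileOutputFormat.setOutputPath(job, new Path("/data/lzo-output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```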
