Why doesn't MapReduce get launched when using the hadoop fs -put command?
Question
Please excuse the basic question, but I wonder why no MapReduce job is launched when we load a file that is larger than the block size.
I learned somewhere that MapReduce takes care of loading datasets from the local file system (LFS) into HDFS. So why don't I see any MapReduce logs on the console when I run the hadoop fs -put command?
Thanks in advance.
Recommended answer
You're thinking of hadoop distcp, which does spawn a MapReduce job.
https://hadoop.apache.org/docs/stable/hadoop-distcp/DistCp.html
DistCp Version 2 (distributed copy) is a tool used for large inter/intra cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list.
hadoop fs -put and hdfs dfs -put are implemented entirely by HDFS and don't require MapReduce. The client splits the file into blocks and streams them directly to the DataNodes, so no job is ever submitted and no MapReduce logs appear.
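To see why a file larger than the block size needs no distributed job, it may help to look at the splitting as plain arithmetic. The following is only a simplified model, not the actual HDFS client code: the function name split_into_blocks is made up for illustration, and the 128 MiB block size is an assumption (a common default for dfs.blocksize, but your cluster may be configured differently).

```python
# Assumed block size: 128 MiB is a common default for dfs.blocksize;
# a real cluster may use a different value.
BLOCK_SIZE = 128 * 1024 * 1024


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the (offset, length) of each block a file would occupy.

    Hypothetical helper: it only models the arithmetic of cutting a
    file into fixed-size blocks, with a shorter final block.
    """
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


# A hypothetical 300 MiB file: two full blocks plus a 44 MiB remainder.
blocks = split_into_blocks(300 * 1024 * 1024)
for index, (offset, length) in enumerate(blocks):
    print(f"block {index}: offset={offset} length={length}")
```

Cutting the file up is a sequential, single-client operation, which is why -put can do it on its own. DistCp uses MapReduce for a different reason: it spreads the copying of many files across map tasks.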