Hadoop - Help required to understand the processing steps

Problem description

I have a compressed file containing 8 XML files of 5-10 KB each. I am using this data for testing. I wrote a map-only program to decompress the file. The program is written against the MR2 API and runs on Hadoop 2.7.1 in pseudo-distributed mode. I start the cluster with the sbin/start-dfs.sh command. I can see the decompressed output in the file system within a few seconds, but the processing continues for another 5-6 minutes. I don't know why.

The MR program has decompressed the files by this stage, and I can view/download them.

I can't understand what processing my MapReduce program is doing here. I am using the MR2 API in my code, so why is it using the MR1 API (mapred) here? The situation gets worse with 128 MB of zipped files: decompression takes 5-10 minutes, and the rest of the time the job is busy doing some other tasks.

The performance I am getting is unacceptable, and I need to understand what processing Hadoop is doing in the second screenshot.

Please help me understand whether this is an installation issue, a problem with my program, or something else.

Solution

This was a configuration issue, and I resolved it by changing the mapred-site.xml file.

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
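
As a quick sanity check before submitting a job, a small script can report which value of mapreduce.framework.name a given mapred-site.xml sets. The sketch below is not part of the original answer: the function name framework_name and the SAMPLE string are hypothetical, and the fallback of "local" assumes the stock Hadoop 2.x default from mapred-default.xml. It reads only the text it is given, so it does not account for overrides set elsewhere, such as in the job driver.

```python
import xml.etree.ElementTree as ET

# Assumed fallback: the stock Hadoop 2.x default in mapred-default.xml.
DEFAULT_FRAMEWORK = "local"

PROPERTY_NAME = "mapreduce.framework.name"


def framework_name(mapred_site_xml: str) -> str:
    """Return the mapreduce.framework.name set in the given mapred-site.xml text.

    Falls back to DEFAULT_FRAMEWORK when the property is absent.
    """
    root = ET.fromstring(mapred_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == PROPERTY_NAME:
            return (prop.findtext("value") or "").strip()
    return DEFAULT_FRAMEWORK


# Hypothetical sample mirroring the configuration from the answer.
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(framework_name(SAMPLE))
```

To check a real installation, read the contents of the mapred-site.xml file from the Hadoop configuration directory and pass the text to framework_name. A result of "local" would suggest the job is not being submitted to YARN.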
