Hadoop HPROF profiling no CPU SAMPLES written
Problem description
I want to use HPROF to profile my Hadoop job. The problem is that I get TRACES but no CPU SAMPLES in the profile.out file. The code I am using inside my run method is:
/** Get configuration */
Configuration conf = getConf();
conf.set("textinputformat.record.delimiter","\n\n");
conf.setStrings("args", args);
/** JVM PROFILING */
conf.setBoolean("mapreduce.task.profile", true);
conf.set("mapreduce.task.profile.params", "-agentlib:hprof=cpu=samples," +
"heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");
conf.set("mapreduce.task.profile.maps", "0-2");
conf.set("mapreduce.task.profile.reduces", "");
/** Job configuration */
Job job = new Job(conf, "HadoopSearch");
job.setJarByClass(Search.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
/** Set Mapper and Reducer, use identity reducer*/
job.setMapperClass(Map.class);
job.setReducerClass(Reducer.class);
/** Set input and output formats */
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
/** Set input and output path */
FileInputFormat.addInputPath(job, new Path("/user/niko/16M"));
FileOutputFormat.setOutputPath(job, new Path(cmd.getOptionValue("output")));
job.waitForCompletion(true);
return 0;
How do I get the CPU SAMPLES to be written to the output?
I also get a strange error message on stderr, but I think it is unrelated, since it also appears when profiling is set to false or when the code that enables profiling is commented out. The error is:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
YARN (or MRv1) kills the container right after your job finishes, so the CPU samples cannot be written to your profiling file. In fact, your traces are probably truncated as well.
You have to add the following options (or the equivalents for your Hadoop version):
yarn.nodemanager.sleep-delay-before-sigkill.ms = 30000
# No. of ms to wait between sending a SIGTERM and SIGKILL to a container
yarn.nodemanager.process-kill-wait.ms = 30000
# Max time to wait for a process to come up when trying to cleanup a container
mapreduce.tasktracker.tasks.sleeptimebeforesigkill = 30000
# The same for MRv1?
(30 seconds seems to be enough.)
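To confirm that the longer kill delay actually helped, one option is to check whether profile.out contains a complete CPU SAMPLES section. The sketch below assumes the usual HPROF text format, in which that section is bracketed by lines starting with `CPU SAMPLES BEGIN` and `CPU SAMPLES END`. The class name `ProfileCheck` and its method are hypothetical helpers made up for this illustration, not part of Hadoop or HPROF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper (not part of Hadoop or HPROF): reports whether an
// HPROF text profile contains a complete CPU SAMPLES section.
public class ProfileCheck {

    // Returns true only if a "CPU SAMPLES BEGIN" line is later followed by
    // a "CPU SAMPLES END" line. A missing END marker suggests the JVM was
    // killed before the agent finished writing the file.
    static boolean hasCompleteCpuSamples(List<String> lines) {
        boolean begun = false;
        for (String line : lines) {
            if (line.startsWith("CPU SAMPLES BEGIN")) {
                begun = true;
            } else if (begun && line.startsWith("CPU SAMPLES END")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ProfileCheck <path to profile.out>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(hasCompleteCpuSamples(lines)
                ? "CPU SAMPLES section is complete"
                : "CPU SAMPLES section is missing or truncated");
    }
}
```

Running it against the profile.out of one of the profiled map tasks (0-2 in the question) before and after changing the delays should show whether the container is still being killed too early.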