HDP 2.4: How to collect Hadoop MapReduce logs into one file using Flume, and what is the best practice


Problem description




We are using HDP 2.4 and have many MapReduce jobs written in various ways (Java MR, Hive, etc.). The logs are collected in the Hadoop file system under the application ID. I want to collect all the logs of an application and append them into a single file (on HDFS or the OS file system of one machine) so that I can analyze my application logs in a single location without hassle. Please also advise the best way to achieve this in HDP 2.4 (stack version info: HDFS 2.7.1.2.4 / YARN 2.7.1.2.4 / MapReduce2 2.7.1.2.4 / Log Search 0.5.0 / Flume 1.5.2.2.4).

Solution

Flume cannot collect the logs after they are already on HDFS.

In order to do this, you need a Flume agent running on all NodeManagers, pointed at the configured yarn.log.dir, and some way to parse the application/container/attempt/file information out of the local OS file path.
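As a rough sketch of the path-parsing step, the snippet below pulls the application, container, attempt, and file name out of a NodeManager local log path with a regular expression. The directory layout and example path are assumptions based on the default `${yarn.nodemanager.log-dirs}` structure, not something stated in the original answer.

```python
import re

# Assumed layout of a NodeManager local log path:
#   <log-dir>/application_<ts>_<seq>/container_<ts>_<seq>_<attempt>_<cid>/<file>
LOG_PATH_RE = re.compile(
    r"/(?P<app>application_\d+_\d+)"
    r"/(?P<container>container_\d+_\d+_(?P<attempt>\d+)_\d+)"
    r"/(?P<file>[^/]+)$"
)

def parse_log_path(path):
    """Extract application/container/attempt/file info from a local log path."""
    m = LOG_PATH_RE.search(path)
    return m.groupdict() if m else None

# Example with a hypothetical log directory and application ID.
info = parse_log_path(
    "/hadoop/yarn/log/application_1466000000000_0001"
    "/container_1466000000000_0001_01_000002/syslog"
)
# info["app"]     == "application_1466000000000_0001"
# info["attempt"] == "01"
# info["file"]    == "syslog"
```

A Flume interceptor (or a wrapper script feeding a Spooling Directory source) could use fields like these to tag events before they are shipped to an HDFS sink.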

I'm not sure how well collecting into a "single file" would work, as each container generates at least 5 files of different information, but YARN log aggregation already does this. The result is just not in a readable file format in HDFS unless you are using Splunk/Hunk, as far as I know.
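For reference, the logs that YARN log aggregation already stores can be read back as one stream with the yarn CLI; the application ID and output file below are placeholders:

```
# Fetch all aggregated container logs for one application as a single stream
# and write them to one local file (application ID is a placeholder).
yarn logs -applicationId application_1466000000000_0001 > app_logs.txt
```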

Alternative solutions include indexing these files into an actual search service such as Solr or Elasticsearch, which I would recommend over HDFS for storing and searching logs.
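As a minimal sketch of the indexing route, the function below turns raw log lines into an Elasticsearch `_bulk` API body (NDJSON). The index name, field layout, and endpoint are illustrative assumptions, not part of the original answer.

```python
import json

def bulk_payload(index, log_lines):
    """Build an Elasticsearch _bulk API body (NDJSON) from raw log lines.

    Each line becomes one document; index name and field names are
    illustrative assumptions.
    """
    actions = []
    for i, line in enumerate(log_lines):
        actions.append(json.dumps({"index": {"_index": index, "_id": i}}))
        actions.append(json.dumps({"message": line}))
    # The bulk body must end with a trailing newline.
    return "\n".join(actions) + "\n"

# Example: two log lines from a hypothetical container log.
payload = bulk_payload("yarn-logs", [
    "2016-06-15 10:00:01 INFO  MapTask: Starting map task",
    "2016-06-15 10:00:02 ERROR ReduceTask: Shuffle failed",
])
# POST this payload to http://<es-host>:9200/_bulk with
# Content-Type: application/x-ndjson (endpoint is an assumption).
```

From there, Kibana or a plain query API gives the single-location search the question asks for, without stitching files together by hand.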
