Why is there no 'hadoop fs -head' shell command?

Question

A fast method for inspecting files on HDFS is to use tail:

~$ hadoop fs -tail /path/to/file

This displays the last kilobyte of data in the file, which is extremely helpful. However, the opposite command, head, does not appear to be part of the shell command set. I find this very surprising.

My hypothesis is that, since HDFS is built for very fast streaming reads of very large files, there is some access-related issue that affects head. This makes me hesitant to read the start of a file by other means. Does anyone have an answer?

Solution

I would say it has more to do with efficiency - head can easily be replicated by piping the output of hadoop fs -cat through the Linux head command.

hadoop fs -cat /path/to/file | head

This is efficient because head exits once it has printed the desired number of lines, which closes the pipe, so hadoop fs -cat stops instead of streaming the rest of the file.
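The early exit can be observed without a cluster. This is only a minimal local sketch, assuming a POSIX shell with seq and head available; seq is a stand-in for hadoop fs -cat, and the variable name is made up for illustration.

```shell
# Stand-in for `hadoop fs -cat`: a producer that would emit ten million lines.
# head exits after 5 lines; the producer's next write to the closed pipe
# fails, so it stops long before it has produced all of its output.
first_lines=$(seq 1 10000000 | head -n 5)

# Print what head let through.
printf '%s\n' "$first_lines"
```

The pipeline returns almost immediately even though the producer was asked for far more data, which is the same reason the hadoop fs -cat pipeline is cheap.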

Using tail in this manner would be considerably less efficient, as you'd have to stream the entire file (all HDFS blocks) to find the final x lines.

hadoop fs -cat /path/to/file | tail

The hadoop fs -tail command, as you note, works on the last kilobyte - hadoop can efficiently find the last block, skip to the position of the final kilobyte, and then stream the output. Piping through tail can't easily do this.
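The difference between seeking and piping can be illustrated on a local file. This is a rough analogy rather than HDFS itself, assuming a POSIX shell with mktemp, seq, cat and tail; the file and variable names are hypothetical.

```shell
# Build a local stand-in file of 5000 lines (roughly 23 KB).
sample=$(mktemp)
seq 1 5000 > "$sample"

# Given the file itself, tail can typically seek close to the end and
# read only the last kilobyte.
direct=$(tail -c 1024 "$sample")

# Given a pipe, tail cannot seek, so it has to consume all of the input
# before it knows where the last kilobyte starts.
piped=$(cat "$sample" | tail -c 1024)

# Both routes yield the same bytes; only the amount of data read differs.
[ "$direct" = "$piped" ] && echo "same output"

rm -f "$sample"
```

On a file this small the cost is negligible, but the piped form reads every byte, which on HDFS means every block of the file.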
