Hadoop custom split of TextFile

Views: 105 Published: 2018/5/31 20:00:53 hadoop

Problem description

I have a fairly large text file that I would like to convert into a SequenceFile. Unfortunately, the file consists of Python code with logical lines running over several physical lines. For example,
print "Blah Blah\
... blah blah"
Each logical line is terminated by a NEWLINE. Could someone clarify how I could possibly generate Key, Value pairs in Map-Reduce where each Value is the entire logical line?
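
One possible way to find the logical-line boundaries, sketched here under some assumptions: Python's standard `tokenize` module emits a `NEWLINE` token at the end of every logical line (and `NL` for line breaks that do not end one), so the physical lines can be grouped on that token. The function name `logical_lines` and the choice of the starting line number as the key are illustrative, not taken from the question. This is plain Python run over the whole file, not Hadoop code. Inside Hadoop the equivalent grouping would presumably live in a custom `RecordReader`, which would also have to cope with logical lines that straddle an input split boundary.

```python
import io
import tokenize

# Token types that never start a logical line on their own.
_SKIP = (
    tokenize.NL,
    tokenize.COMMENT,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
)


def logical_lines(source):
    """Yield (start_line_number, text) for each logical line of Python source.

    Hypothetical helper: the key is the 1-based physical line where the
    logical line starts, the value is the full text of that logical line
    (embedded line breaks kept, trailing line break removed).
    """
    physical = source.splitlines(keepends=True)
    start = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NEWLINE:
            if start is not None:
                # tok.start[0] is the last physical line of this logical line.
                text = "".join(physical[start - 1:tok.start[0]])
                yield start, text.rstrip("\n")
                start = None
        elif tok.type not in _SKIP and start is None:
            # First real token after a NEWLINE opens a new logical line.
            start = tok.start[0]


if __name__ == "__main__":
    sample = 'print "Blah Blah\\\n... blah blah"\nx = 1\n'
    for key, value in logical_lines(sample):
        print(key, repr(value))
```

Each yielded pair could then be written out as one record, for example with the line number (or a byte offset) as the key and the logical line as the value. Because `tokenize` only splits the text into tokens and does not parse it, the Python 2 style `print` statement in the example is not a problem for it.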
