InputFormat Decision
I am trying to figure out which of the given answers best suits the following question:
Given a directory of files with the following structure: line number, tab character, string:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfsikj
You want to send each line as one record to your Mapper. Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?
A. SequenceFileAsTextInputFormat
B. SequenceFileInputFormat
C. KeyValueFileInputFormat
D. BDBInputFormat
My analysis:
Option A is a format I found to exist, but I'm not sure how it is used correctly or whether it suits as an answer.
Option B is not possible, since SequenceFiles are files of binary (K,V) pairs and thus do not suit plain-text input.
Option C is not possible because there is no KeyValueFileInputFormat. However, if it is a typo and it is actually KeyValueTextInputFormat, then I think it would be a good choice. Or wouldn't it?
Option D is not possible because there is no BDBInputFormat, and even if it is a typo for DBInputFormat, that format reads from a database and wouldn't suit this case.
Thank you!
The answer is option C, assuming the name is a typo for KeyValueTextInputFormat.
KeyValueTextInputFormat splits each line at the first tab character.
So the line number will be the key and the string will be the value.
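To make the key/value split concrete, here is a minimal sketch in plain Java that imitates the behaviour described above. It does not use Hadoop: the class `KeyValueSplitSketch` and the helper `splitAtFirstTab` are hypothetical names made up for illustration. It also assumes the separator is the default tab, and that a line without a tab becomes the key with an empty value.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Illustration only: a stand-in for the split that KeyValueTextInputFormat
// performs on each line. This is NOT Hadoop code.
class KeyValueSplitSketch {

    // Hypothetical helper: everything before the first tab is the key,
    // everything after it is the value.
    static Map.Entry<String, String> splitAtFirstTab(String line) {
        int pos = line.indexOf('\t');
        if (pos < 0) {
            // Assumption: with no separator, the whole line is the key
            // and the value is empty.
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        // Sample lines from the question, with the tab written explicitly.
        String[] lines = {
            "1\tabialkjfjkaoasdfjksdlkjhqweroij",
            "2\tkadfjhuwqounahagtnbvaswslmnbfgy",
            "3\tkjfteiomndscxeqalkzhtopedkfsikj"
        };
        for (String line : lines) {
            Map.Entry<String, String> record = splitAtFirstTab(line);
            System.out.println("key=" + record.getKey() + " value=" + record.getValue());
        }
    }
}
```

In a real job the Mapper would receive these pairs as `Text` keys and `Text` values, instead of the byte offset and whole line that the default TextInputFormat delivers.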