Hadoop questions


Problem Description


I want to verify the answers to the following sample questions.

Question 1

You use the hadoop fs -put command to add sales.txt to HDFS. This file is small enough that it fits into a single block, which is replicated to three nodes within your cluster. When and how will the cluster handle replication following the failure of one of these nodes?

A. The cluster will make no attempt to re-replicate this block.
B. This block will be immediately re-replicated and all other HDFS operations on the cluster will halt while this is in progress.
C. The block will remain under-replicated until the administrator manually deletes and recreates the file.
D. The file will be re-replicated automatically after the NameNode determines it is under-replicated based on the block reports it receives from the DataNodes.

I believe the answer is D

Question 2

You need to write code to perform a complex calculation that takes several steps. You have decided to chain these jobs together and develop a custom composite class for the key that stores the results of intermediate calculations. Which interface must this key implement?

A. Writable
B. Transferable
C. CompositeSortable
D. WritableComparable

I believe the answer is D

Question 3

You are developing an application that uses a year for the key. Which Hadoop-supplied data type would be most appropriate for a key that represents a year?

A. Text
B. IntWritable
C. NullWritable
D. BytesWritable
E. None of these would be appropriate. You would need to implement a custom key.

I believe the answer is B

Solution

1 - Correct. You can find this in any literature that describes the fault tolerance of HDFS. There is a section in Chapter 3 of Hadoop: The Definitive Guide that describes the process of a client writing data to HDFS and walks through, play by play, how failures during that process are handled.
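As a side note (not part of the original answer), the target replication factor and the current replica locations of a file can be inspected from Java through the HDFS FileSystem API. This is a minimal sketch; the path /user/hadoop/sales.txt is an assumption for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/hadoop/sales.txt"); // hypothetical path

        FileStatus status = fs.getFileStatus(file);
        // Target replication factor requested for this file.
        System.out.println("Target replication factor: " + status.getReplication());

        // One entry per block; each lists the DataNodes currently holding a replica.
        for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("Block replicas on: " + String.join(", ", block.getHosts()));
        }
    }
}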

2 - Correct. Keys must be WritableComparable so that the framework can sort them during the shuffle; a plain Writable is not enough, since Writables also include arrays and other types that are not comparable.
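To make this concrete, here is a hedged sketch of a composite key that implements WritableComparable. The class name, fields, and sort order are illustrative assumptions, not something specified in the question; the essential parts are the no-argument constructor, write/readFields for serialization, and compareTo for the shuffle sort.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical composite key holding intermediate results between chained jobs.
public class IntermediateKey implements WritableComparable<IntermediateKey> {
    private int stage;        // which step of the calculation produced this result
    private long partialSum;  // intermediate value carried to the next job

    public IntermediateKey() {}  // no-arg constructor required by Hadoop serialization

    public IntermediateKey(int stage, long partialSum) {
        this.stage = stage;
        this.partialSum = partialSum;
    }

    @Override
    public void write(DataOutput out) throws IOException {  // serialize the fields
        out.writeInt(stage);
        out.writeLong(partialSum);
    }

    @Override
    public void readFields(DataInput in) throws IOException {  // deserialize the fields
        stage = in.readInt();
        partialSum = in.readLong();
    }

    @Override
    public int compareTo(IntermediateKey other) {  // defines the sort order in the shuffle
        int cmp = Integer.compare(stage, other.stage);
        return cmp != 0 ? cmp : Long.compare(partialSum, other.partialSum);
    }

    @Override
    public int hashCode() {  // used by the default HashPartitioner
        return 31 * stage + Long.hashCode(partialSum);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof IntermediateKey)) return false;
        IntermediateKey k = (IntermediateKey) o;
        return stage == k.stage && partialSum == k.partialSum;
    }
}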

3 - Correct. A year is a numeric value, so of the options listed the most appropriate is IntWritable.
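For illustration only, a mapper that emits the year as an IntWritable key might look like the sketch below. It assumes CSV input lines such as 2013-07-15,42.50, which is not specified in the question.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: parses the year from a date at the start of each line
// and emits it as an IntWritable key, with the rest of the record as the value.
public class YearKeyMapper extends Mapper<LongWritable, Text, IntWritable, Text> {
    private final IntWritable year = new IntWritable();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split(",");
        year.set(Integer.parseInt(fields[0].substring(0, 4))); // first 4 chars = year
        context.write(year, new Text(fields[1]));
    }
}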
