Advantages of using NullWritable in Hadoop
What are the advantages of using NullWritable for null keys/values over using null texts (i.e. new Text(null))? I see the following from the "Hadoop: The Definitive Guide" book.
NullWritable is a special type of Writable, as it has a zero-length serialization. No bytes are written to, or read from, the stream. It is used as a placeholder; for example, in MapReduce, a key or a value can be declared as a NullWritable when you don't need to use that position; it effectively stores a constant empty value. NullWritable can also be useful as a key in SequenceFile when you want to store a list of values, as opposed to key-value pairs. It is an immutable singleton: the instance can be retrieved by calling NullWritable.get().
I do not clearly understand how the output is written out using NullWritable. Will there be a single constant value at the beginning of the output file indicating that the keys or values of this file are null, so that the MapReduce framework can skip reading the null keys/values (whichever is null)? Also, how are null texts actually serialized?

Thanks,
Venkat
The key/value types must be given at runtime, so anything writing or reading NullWritables will know ahead of time that it will be dealing with that type; there is no marker or anything in the file. And technically the NullWritables are "read"; it's just that "reading" a NullWritable is actually a no-op. You can see for yourself that there's nothing at all written or read:
import java.io.*;
import java.util.Arrays;
import org.apache.hadoop.io.NullWritable;

NullWritable nw = NullWritable.get();
ByteArrayOutputStream out = new ByteArrayOutputStream();
nw.write(new DataOutputStream(out));
System.out.println(Arrays.toString(out.toByteArray())); // prints "[]"
ByteArrayInputStream in = new ByteArrayInputStream(new byte[0]);
nw.readFields(new DataInputStream(in)); // works just fine: reading is a no-op
And as for your question about new Text(null), again, you can try it out:
Text text = new Text((String) null);
ByteArrayOutputStream out = new ByteArrayOutputStream();
text.write(new DataOutputStream(out)); // throws NullPointerException
System.out.println(Arrays.toString(out.toByteArray()));

Text will not work at all with a null String.
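The zero-length behavior is easy to reproduce outside Hadoop. Below is a minimal sketch, with no Hadoop dependency, of a hypothetical NullLikeWritable singleton whose write and readFields are no-ops; the class name and code are illustrative, not Hadoop's actual source, but they mirror the contract the book describes for NullWritable: one shared immutable instance and zero bytes on the wire.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullLikeWritable {
    // Immutable singleton, mirroring NullWritable.get().
    private static final NullLikeWritable INSTANCE = new NullLikeWritable();

    private NullLikeWritable() {}

    public static NullLikeWritable get() {
        return INSTANCE;
    }

    // Zero-length serialization: write nothing to the stream.
    public void write(DataOutput out) throws IOException {}

    // Zero-length deserialization: read nothing from the stream.
    public void readFields(DataInput in) throws IOException {}

    public static void main(String[] args) throws IOException {
        NullLikeWritable nw = NullLikeWritable.get();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        nw.write(new DataOutputStream(out));
        System.out.println(out.size()); // prints 0: no bytes were written
        nw.readFields(new DataInputStream(
                new ByteArrayInputStream(new byte[0]))); // no-op, no exception
    }
}
```

Because every instance is the same object and serialization is empty, a reader only needs the runtime type to "reconstruct" the value, which is exactly why no marker is needed in the file.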