What is the reason for having Writable wrapper classes in Hadoop MapReduce for Java types?


Question

It seems to me that an org.apache.hadoop.io.serializer.Serialization could be written to serialize the Java types directly, in the same format that the wrapper classes serialize them into. That way, Mappers and Reducers wouldn't have to deal with the wrapper classes.

Recommended answer

Nothing stops you from changing the serialization to use a different mechanism, such as the Java Serializable interface, or something like Thrift or Protocol Buffers.

In fact, Hadoop comes with an (experimental) Serialization implementation for Java Serializable objects: just configure the serialization factory to use it. The default serialization mechanism is WritableSerialization, but this can be changed by setting the following configuration property:

io.serializations=org.apache.hadoop.io.serializer.JavaSerialization
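The same property can also be set programmatically on the job's Configuration. The following is a minimal sketch, not a definitive setup: it assumes hadoop-common is on the classpath, and the class name SerializationConfigSketch is made up for illustration. Listing WritableSerialization alongside JavaSerialization is a choice of this sketch (so that built-in Writable types keep working); the property line above registers JavaSerialization only.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.serializer.JavaSerialization;
import org.apache.hadoop.io.serializer.WritableSerialization;

// Hypothetical class name, for illustration only.
public class SerializationConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Assumption of this sketch: register both serializations, so that
        // Writable types and java.io.Serializable types are both accepted.
        conf.setStrings("io.serializations",
                WritableSerialization.class.getName(),
                JavaSerialization.class.getName());

        // Print the resulting comma-separated property value.
        System.out.println(conf.get("io.serializations"));
    }
}
```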

Bear in mind, however, that anything that expects a Writable (input/output formats, partitioners, comparators, etc.) will need to be replaced by a version that can be passed a Serializable instance rather than a Writable instance.

Some more links for the curious reader:

  • http://www.tom-e-white.com/2008/07/rpc-and-serialization-with-hadoop.html
  • What are the connections and differences between Hadoop Writable and java.io.serialization? This seems to be a similar question to the one you're asking, and Tariq has a good link to a thread in which Doug Cutting explains the rationale behind using Writables over Serializables.
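
To get a rough feel for one difference between the two approaches, here is a small sketch that uses only the JDK (no Hadoop dependency). It compares the bytes produced by writing a raw int through DataOutput, which is what IntWritable's write method does, with the bytes produced by standard Java serialization of a boxed Integer. The class and method names are made up for illustration, and this only looks at encoded size, not at the other trade-offs discussed in the linked thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical class name, for illustration only.
public class SerializedSizeSketch {

    // Writable-style encoding: only the 4 raw bytes of the int are written.
    static byte[] writableStyle(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        }
        return bytes.toByteArray();
    }

    // Standard Java serialization of a boxed Integer: the stream also
    // carries a header and a class descriptor in addition to the value.
    static byte[] javaSerializable(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(Integer.valueOf(value));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Writable-style bytes:  " + writableStyle(42).length);
        System.out.println("Java-serialized bytes: " + javaSerializable(42).length);
    }
}
```

The Writable-style output is always 4 bytes for an int, while the Java-serialized form is considerably larger because it describes the class as well as the value.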
