Performant Entity Serialization: BSON vs MessagePack (vs JSON)


Question

Recently I found MessagePack, a binary serialization format that is an alternative to Google's Protocol Buffers and JSON and that reportedly outperforms both.

There is also the BSON serialization format, which MongoDB uses for storing data.

Can somebody elaborate on the differences between BSON and MessagePack, and on the advantages and disadvantages of each?

Just to complete the list of performant binary serialization formats: there are also Gobs, which are going to be the successor of Google's Protocol Buffers. In contrast to all the other formats mentioned, Gobs were built around Go and its built-in reflection, although Gob libraries exist for at least one language other than Go.

Recommended answer

// Please note that I'm the author of MessagePack. This answer may be biased.

Format design

  1. Compatibility with JSON

In spite of its name, BSON is less compatible with JSON than MessagePack is.

BSON has special types such as "ObjectId", "Min key", "UUID" and "MD5" (I think MongoDB requires these types). These types have no JSON equivalent, so some type information can be lost when you convert objects from BSON to JSON, though of course only when those special types appear in the BSON source. This can be a disadvantage if you use both JSON and BSON in a single service.

MessagePack is designed to be transparently converted from and to JSON.

  2. MessagePack is smaller than BSON

MessagePack's format is less verbose than BSON's. As a result, MessagePack can serialize objects into fewer bytes than BSON.

For example, a simple map {"a":1, "b":2} is serialized in 7 bytes with MessagePack, while BSON uses 19 bytes.
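The byte counts above can be checked by hand. The following is a minimal sketch in plain Ruby with no gems, assuming the published MessagePack and BSON wire layouts. The helper names `msgpack_small_map` and `bson_small_map` are hypothetical and are not part of any library; they only handle short string keys with small non-negative integer values.

```ruby
# Hand-rolled encoders for one map shape only: at most 15 pairs, string keys
# shorter than 32 bytes, integer values in 0..127. Not a general serializer.

# MessagePack: fixmap header, then fixstr key + positive fixint value per pair.
def msgpack_small_map(map)
  bytes = [0x80 | map.size]           # fixmap: 1000xxxx, xxxx = pair count
  map.each do |key, value|
    bytes << (0xa0 | key.bytesize)    # fixstr: 101xxxxx, xxxxx = byte length
    bytes.concat(key.bytes)
    bytes << value                    # positive fixint: 0xxxxxxx
  end
  bytes.pack('C*')
end

# BSON: int32 total length, then (type, cstring key, int32 value) per pair,
# then a trailing 0x00 terminator. All integers are little-endian.
def bson_small_map(map)
  body = map.map do |key, value|
    [0x10].pack('C') + key.b + "\x00".b + [value].pack('l<')   # 0x10 = int32
  end.join
  total = 4 + body.bytesize + 1       # the length field counts itself and the terminator
  [total].pack('l<') + body + "\x00".b
end

doc = { 'a' => 1, 'b' => 2 }
mp = msgpack_small_map(doc)
bs = bson_small_map(doc)

puts "MessagePack: #{mp.bytesize} bytes  #{mp.unpack1('H*')}"
puts "BSON:        #{bs.bytesize} bytes  #{bs.unpack1('H*')}"
```

Most of the difference comes from BSON's 4-byte length prefix, its fixed 4-byte int32 values, and the NUL terminators on keys and on the document.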

  3. BSON supports in-place updating

With BSON, you can modify part of a stored object without re-serializing the whole object. Suppose a map {"a":1, "b":2} is stored in a file and you want to update the value of "a" from 1 to 2000.

With MessagePack, 1 uses only 1 byte but 2000 uses 3 bytes. So "b" must be shifted back by 2 bytes, even though "b" itself is not modified.

With BSON, both 1 and 2000 use 5 bytes. Because of this verbosity, you don't have to move "b".
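As a rough illustration of the update described above, the sketch below uses plain Ruby with no gems. The helpers `msgpack_uint`, `bson_example_document` and `patch_int32!` are hypothetical names made up for this example. It shows that the MessagePack width of an integer depends on its value, while a BSON int32 value can be overwritten in place.

```ruby
# MessagePack encoding of a non-negative integer (this sketch stops at uint 32).
def msgpack_uint(n)
  case n
  when 0..0x7f        then [n].pack('C')          # positive fixint: 1 byte
  when 0..0xff        then [0xcc, n].pack('CC')   # uint 8: 2 bytes
  when 0..0xffff      then [0xcd, n].pack('Cn')   # uint 16: 3 bytes, big-endian
  when 0..0xffff_ffff then [0xce, n].pack('CN')   # uint 32: 5 bytes, big-endian
  else raise ArgumentError, 'outside the range covered by this sketch'
  end
end

# BSON bytes for {"a":1, "b":2}, written out by hand.
def bson_example_document
  [0x13, 0, 0, 0,                # int32 total length = 19, little-endian
   0x10, 0x61, 0, 1, 0, 0, 0,    # type 0x10 (int32), key "a\0", value 1
   0x10, 0x62, 0, 2, 0, 0, 0,    # type 0x10 (int32), key "b\0", value 2
   0].pack('C*')                 # document terminator
end

# Overwrite a 4-byte int32 value in place; the buffer length never changes.
def patch_int32!(buffer, offset, value)
  buffer[offset, 4] = [value].pack('l<')
  buffer
end

puts "MessagePack 1    -> #{msgpack_uint(1).bytesize} byte(s)"
puts "MessagePack 2000 -> #{msgpack_uint(2000).bytesize} byte(s)"

doc = bson_example_document
patch_int32!(doc, 7, 2000)       # offset 7 = 4 (length) + 1 (type) + 2 ("a\0")
puts "BSON after update: #{doc.bytesize} bytes, a = #{doc.byteslice(7, 4).unpack1('l<')}"
```

The "5 bytes" in the text corresponds to the type byte plus the 4 value bytes; only the 4 value bytes are rewritten here, and the bytes that hold "b" stay where they were.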

  4. MessagePack has RPC

MessagePack, Protocol Buffers, Thrift and Avro support RPC, but BSON doesn't.

These differences reflect that MessagePack was originally designed for network communication, while BSON was designed for storage.

Implementation and API design

  1. MessagePack has type-checking APIs (Java, C++ and D)

MessagePack supports static typing.

The dynamic typing used with JSON or BSON is useful for dynamic languages like Ruby, Python or JavaScript, but it is troublesome for static languages: you must write tedious type-checking code.

MessagePack provides a type-checking API. It converts dynamically typed objects into statically typed objects. Here is a simple example (C++):

    #include <string>
    #include <vector>

    #include <msgpack.hpp>

    class myclass {
    private:
        std::string str;
        std::vector<int> vec;
    public:
        // This macro enables this class to be serialized/deserialized
        MSGPACK_DEFINE(str, vec);
    };

    int main(void) {
        // serialize
        myclass m1 = ...;

        msgpack::sbuffer buffer;
        msgpack::pack(&buffer, m1);

        // deserialize
        msgpack::unpacked result;
        msgpack::unpack(&result, buffer.data(), buffer.size());

        // you get dynamically-typed object
        msgpack::object obj = result.get();

        // convert it to statically-typed object
        myclass m2 = obj.as<myclass>();
    }

  2. MessagePack has IDL

Related to the type-checking API, MessagePack supports an IDL. (The specification is available from: http://wiki.msgpack.org/display/MSGPACK/Design+of+IDL)

Protocol Buffers and Thrift require an IDL (they don't support dynamic typing) and provide more mature IDL implementations.

  3. MessagePack has a streaming API (Ruby, Python, Java, C++, ...)

MessagePack supports streaming deserializers. This feature is useful for network communication. Here is an example (Ruby):

    require 'msgpack'

    # write objects to stdout
    $stdout.write [1,2,3].to_msgpack
    $stdout.write [1,2,3].to_msgpack

    # read objects from stdin using streaming deserializer
    unpacker = MessagePack::Unpacker.new($stdin)
    # use iterator
    unpacker.each {|obj|
      p obj
    }
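To show why a streaming deserializer suits network input, here is a minimal sketch in plain Ruby with no gems. The class `TinyStreamUnpacker` is hypothetical and is not how the msgpack gem is implemented. It understands only two MessagePack types (positive fixint and fixarray), which is enough to decode the [1,2,3] objects from the example even when they arrive split across chunk boundaries.

```ruby
# Minimal streaming decoder for a tiny MessagePack subset: positive fixint
# (0x00..0x7f) and fixarray (0x90..0x9f), nested arbitrarily.
class TinyStreamUnpacker
  Incomplete = Class.new(StandardError)

  def initialize
    @buffer = ''.b
  end

  # Append a chunk (possibly ending mid-object) and yield each object that is
  # now complete. Leftover bytes stay buffered for the next call.
  def feed_each(chunk)
    @buffer << chunk.b
    loop do
      break if @buffer.empty?
      begin
        object, consumed = parse(0)
      rescue Incomplete
        break                         # wait for more bytes
      end
      @buffer = @buffer.byteslice(consumed, @buffer.bytesize - consumed)
      yield object
    end
  end

  private

  # Returns [object, offset just past it], or raises Incomplete.
  def parse(offset)
    tag = @buffer.getbyte(offset) or raise Incomplete
    if tag <= 0x7f
      [tag, offset + 1]               # positive fixint is its own value
    elsif (0x90..0x9f).cover?(tag)
      position = offset + 1
      items = Array.new(tag & 0x0f) do
        item, position = parse(position)
        item
      end
      [items, position]
    else
      raise ArgumentError, format('type 0x%02x is outside this sketch', tag)
    end
  end
end

# Two serialized [1,2,3] arrays, delivered in three unevenly sized chunks.
wire = [0x93, 1, 2, 3, 0x93, 1, 2, 3].pack('C*')
unpacker = TinyStreamUnpacker.new
[wire.byteslice(0, 2), wire.byteslice(2, 3), wire.byteslice(5, 3)].each do |chunk|
  unpacker.feed_each(chunk) { |obj| p obj }
end
```

The first chunk ends in the middle of an array, so nothing is yielded until the second chunk completes it. Re-parsing from the start of the buffer on every call keeps the sketch short; a production decoder would more likely keep its parse state between calls.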
