Performant Entity Serialization: BSON vs MessagePack (vs JSON)

Question

Recently I found MessagePack, a binary serialization format that is an alternative to Google's Protocol Buffers and JSON and that also outperforms both.

There's also the BSON serialization format, which MongoDB uses for storing data.

Can somebody elaborate on the differences between BSON and MessagePack, and on their advantages and disadvantages?

Just to complete the list of performant binary serialization formats: there are also Gobs, which are going to be the successor of Google's Protocol Buffers. However, in contrast to all the other formats mentioned, they are not language-agnostic and rely on Go's built-in reflection. There are also Gobs libraries for at least one language other than Go.

Recommended answer

// Please note that I'm the author of MessagePack. This answer may be biased.

Format design

  1. Compatibility with JSON

In spite of its name, BSON is less compatible with JSON than MessagePack is.

BSON has special types like "ObjectId", "Min key", "UUID" or "MD5" (I think these types are required by MongoDB). These types are not compatible with JSON. That means some type information can be lost when you convert objects from BSON to JSON, though of course only when these special types are present in the BSON source. This can be a disadvantage if you use both JSON and BSON in a single service.

MessagePack is designed to be transparently converted from/to JSON.
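
To make the JSON mapping concrete, here is a minimal sketch in Ruby. It does not use the msgpack gem: `tiny_msgpack` is a hypothetical helper written for this illustration, and it covers only a small subset of the format (nil, booleans, integers 0..127, short strings, small arrays and maps). The point is only that each JSON value it handles has a direct MessagePack counterpart.

```ruby
require "json"

# Hypothetical, deliberately incomplete encoder for illustration only.
# Supported: nil, true/false, integers in 0..127, strings up to 31 bytes,
# arrays and maps with up to 15 entries. Anything else raises.
def tiny_msgpack(value)
  case value
  when nil   then [0xc0].pack("C")                  # nil
  when false then [0xc2].pack("C")                  # false
  when true  then [0xc3].pack("C")                  # true
  when Integer
    raise ArgumentError, "only 0..127 supported" unless (0..127).cover?(value)
    [value].pack("C")                               # positive fixint
  when String
    bytes = value.b
    raise ArgumentError, "string too long" if bytes.bytesize > 31
    [0xa0 | bytes.bytesize].pack("C") + bytes       # fixstr
  when Array
    raise ArgumentError, "array too long" if value.size > 15
    [0x90 | value.size].pack("C") +                 # fixarray header
      value.map { |v| tiny_msgpack(v) }.join
  when Hash
    raise ArgumentError, "map too large" if value.size > 15
    [0x80 | value.size].pack("C") +                 # fixmap header
      value.map { |k, v| tiny_msgpack(k) + tiny_msgpack(v) }.join
  else
    raise ArgumentError, "unsupported type: #{value.class}"
  end
end

json = '{"a":1,"b":[true,null,"hi"]}'
packed = tiny_msgpack(JSON.parse(json))
puts "JSON: #{json.bytesize} bytes, MessagePack: #{packed.bytesize} bytes"
```

A real application would use an actual MessagePack library, which also handles floats, larger integers, and longer strings and containers.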

  2. MessagePack is smaller than BSON

MessagePack's format is less verbose than BSON's. As a result, MessagePack serializes objects into fewer bytes than BSON.

For example, a simple map {"a":1, "b":2} is serialized in 7 bytes with MessagePack, while BSON uses 19 bytes.
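
The two byte counts can be checked by writing out each encoding by hand. The sketch below uses only Ruby's built-in `Array#pack` (no msgpack or bson library), and the variable names are chosen for this illustration.

```ruby
# Hand-encode {"a":1, "b":2} in both formats so every byte is visible.

# MessagePack: 1-byte map header, then key/value pairs.
msgpack_bytes = [
  0x82,             # fixmap holding 2 entries
  0xa1, "a".ord,    # fixstr of length 1: "a"
  0x01,             # positive fixint 1
  0xa1, "b".ord,    # fixstr of length 1: "b"
  0x02              # positive fixint 2
].pack("C*")

# BSON: int32 document length, typed elements, trailing NUL byte.
# Each element is: type tag 0x10 (int32), NUL-terminated key, 4-byte value.
bson_elements =
  [0x10].pack("C") + "a\0".b + [1].pack("l<") +
  [0x10].pack("C") + "b\0".b + [2].pack("l<")
bson_length = 4 + bson_elements.bytesize + 1   # length field + elements + terminator
bson_bytes = [bson_length].pack("l<") + bson_elements + "\0".b

puts "MessagePack: #{msgpack_bytes.bytesize} bytes"
puts "BSON:        #{bson_bytes.bytesize} bytes"
```

Most of the difference comes from BSON's 4-byte length prefix, its fixed 4-byte integers, and its NUL terminators.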

  3. BSON supports in-place updating

With BSON, you can modify part of a stored object without re-serializing the whole object. Let's suppose a map {"a":1, "b":2} is stored in a file and you want to update the value of "a" from 1 to 2000.

With MessagePack, 1 uses only 1 byte but 2000 uses 3 bytes. So "b" must be moved backward by 2 bytes, even though "b" itself is not modified.

With BSON, both 1 and 2000 use 5 bytes (a 1-byte type tag plus a 4-byte int32). Because of this verbosity, you don't have to move "b".
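
The update scenario can be sketched on hand-built byte strings. This uses only Ruby's standard library; `bson_int32_element` is a hypothetical helper for this illustration, and a real store would locate the field by parsing the document rather than by a hard-coded offset.

```ruby
# Hypothetical helper: one BSON int32 element (type tag, key, NUL, value).
def bson_int32_element(key, value)
  [0x10].pack("C") + "#{key}\0".b + [value].pack("l<")
end

# BSON document for {"a":1, "b":2}: length, two elements, terminator.
body = bson_int32_element("a", 1) + bson_int32_element("b", 2)
bson = [4 + body.bytesize + 1].pack("l<") + body + "\0".b

# The value of "a" follows the 4-byte length, the 1-byte tag and "a\0".
value_offset = 4 + 1 + 2
b_offset_before = bson.index("b\0".b)
bson[value_offset, 4] = [2000].pack("l<")   # overwrite 4 bytes with 4 bytes
b_offset_after = bson.index("b\0".b)

# MessagePack: fixint 1 is one byte, 2000 needs uint16 (0xcd + 2 bytes).
msgpack_before = [0x82, 0xa1, "a".ord, 0x01, 0xa1, "b".ord, 0x02].pack("C*")
msgpack_after  = [0x82, 0xa1, "a".ord].pack("C*") +
                 [0xcd, 2000].pack("Cn") +
                 [0xa1, "b".ord, 0x02].pack("C*")

puts "BSON: \"b\" stays at offset #{b_offset_after}, size #{bson.bytesize}"
puts "MessagePack: grows by #{msgpack_after.bytesize - msgpack_before.bytesize} bytes"
```

In the BSON case the document length and every later offset are unchanged, so the write touches only 4 bytes. In the MessagePack case everything after the updated value has to be rewritten at a new position.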

  4. MessagePack has RPC

MessagePack, Protocol Buffers, Thrift and Avro support RPC, but BSON doesn't.

These differences reflect that MessagePack was originally designed for network communication, while BSON was designed for storage.

Implementation and API design

  1. MessagePack has type-checking APIs (Java, C++ and D)

MessagePack supports static typing.

The dynamic typing used with JSON or BSON is useful for dynamic languages like Ruby, Python or JavaScript, but it is troublesome for static languages: you must write tedious type-checking code.

MessagePack provides a type-checking API. It converts dynamically-typed objects into statically-typed objects. Here is a simple example (C++):

    #include <msgpack.hpp>
    #include <string>
    #include <vector>

    class myclass {
    private:
        std::string str;
        std::vector<int> vec;
    public:
        // This macro enables this class to be serialized/deserialized
        MSGPACK_DEFINE(str, vec);
    };

    int main(void) {
        // serialize
        myclass m1 = ...;

        msgpack::sbuffer buffer;
        msgpack::pack(&buffer, m1);

        // deserialize
        msgpack::unpacked result;
        msgpack::unpack(&result, buffer.data(), buffer.size());

        // you get dynamically-typed object
        msgpack::object obj = result.get();

        // convert it to statically-typed object
        myclass m2 = obj.as<myclass>();
    }

  2. MessagePack has IDL

Related to the type-checking API, MessagePack also supports an IDL. (The specification is available from: http://wiki.msgpack.org/display/MSGPACK/Design+of+IDL)

Protocol Buffers and Thrift require an IDL (they don't support dynamic typing) and provide more mature IDL implementations.

  3. MessagePack has streaming API (Ruby, Python, Java, C++, ...)

MessagePack supports streaming deserializers. This feature is useful for network communication. Here is an example (Ruby):

    require 'msgpack'

    # write objects to stdout
    $stdout.write [1,2,3].to_msgpack
    $stdout.write [1,2,3].to_msgpack

    # read objects from stdin using streaming deserializer
    unpacker = MessagePack::Unpacker.new($stdin)
    # use iterator
    unpacker.each {|obj|
      p obj
    }
