Boost binary archives - reducing size

Question

I am trying to reduce the memory size of boost archives in C++.

One problem I have found is that Boost's binary archives default to using 4 bytes for any int, regardless of its magnitude. For this reason, I am getting that an empty boost binary archive takes 62 bytes while an empty text archive takes 40 (text representation of an empty text archive: 22 serialization::archive 14 0 0 1 0 0 0 0 0).

Is there any way to change this default behavior for ints?

Else, are there any other ways to optimize the size of a binary archive apart from using make_array for vectors?

Recommended answer

Q. I am trying to reduce the memory size of Boost archives in C++.

See Boost C++ Serialization overhead

Q. One problem I have found is that Boost's binary archives default to using 4 bytes for any int, regardless of its magnitude.

That's because it's a serialization library, not a compression library

Q. For this reason, an empty Boost binary archive takes 62 bytes while an empty text archive takes 40 (text representation of an empty text archive: 22 serialization::archive 14 0 0 1 0 0 0 0 0).

Use the archive flags: e.g. from Boost Serialization : How To Predict The Size Of The Serialized Result?:

  • Tune things (boost::archive::no_codecvt, boost::archive::no_header, disable tracking etc.)

  • Q. Is there any way to change this default behavior for ints?

    No. There is BOOST_IS_BITWISE_SERIALIZABLE(T) though (see e.g. Boost serialization bitwise serializability for an example and explanations).

    Q. Else, are there any other ways to optimize the size of a binary archive apart from using make_array for vectors?

    Using make_array doesn't help for vector<int>:

    Live On Coliru

    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/serialization/vector.hpp>
    #include <sstream>
    #include <iostream>
    
    static auto const flags = boost::archive::no_header | boost::archive::no_tracking;
    
    template <typename T>
    std::string direct(T const& v) {
        std::ostringstream oss;
        {
            boost::archive::binary_oarchive oa(oss, flags);
            oa << v;
        }
        return oss.str();
    }
    
    template <typename T>
    std::string as_pod_array(T const& v) {
        std::ostringstream oss;
        {
            boost::archive::binary_oarchive oa(oss, flags);
            oa << v.size() << boost::serialization::make_array(v.data(), v.size());
        }
        return oss.str();
    }
    
    int main() {
        std::vector<int> i(100);
        std::cout << "direct: "       << direct(i).size() << "\n";
        std::cout << "as_pod_array: " << as_pod_array(i).size() << "\n";
    }
    

    Prints

    direct: 408
    as_pod_array: 408
    

  • Compression

    The most straightforward way to optimize is to compress the resulting stream (see also the benchmarks added here).

    Barring that, you will have to override default serialization and apply your own compression (which could be a simple run-length encoding, huffman coding or something more domain specific).

    Live On Coliru

    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/serialization/vector.hpp>
    #include <sstream>
    #include <iostream>
    #include <boost/iostreams/filter/bzip2.hpp>
    #include <boost/iostreams/filtering_stream.hpp>
    #include <boost/iostreams/device/back_inserter.hpp>
    #include <boost/iostreams/copy.hpp>
    
    static auto const flags = boost::archive::no_header | boost::archive::no_tracking;
    
    template <typename T>
    size_t archive_size(T const& v)
    {
        std::stringstream ss;
        {
            boost::archive::binary_oarchive oa(ss, flags);
            oa << v;
        }
    
        std::vector<char> compressed;
        {
            boost::iostreams::filtering_ostream fos;
            fos.push(boost::iostreams::bzip2_compressor());
            fos.push(boost::iostreams::back_inserter(compressed));
    
            boost::iostreams::copy(ss, fos);
        }
    
        return compressed.size();
    }
    
    int main() {
        std::vector<int> i(100);
        std::cout << "bzip2: " << archive_size(i) << "\n";
    }
    

    Prints

    bzip2: 47
    

    That's a compression ratio of ~11% (or ~19% if you drop the archive flags).
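As an illustration of the "apply your own compression" option mentioned above, here is a minimal sketch of a variable-length integer encoding (7 bits per byte, with a ZigZag step so that small negative values also stay small). It is not part of Boost and not from the original answer; it uses only the standard library. All names (zigzag_encode, append_varint, pack_ints, unpack_ints) are hypothetical, and it assumes 32-bit ints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// ZigZag maps small-magnitude signed values to small unsigned ones:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
inline std::uint32_t zigzag_encode(std::int32_t v) {
    return (static_cast<std::uint32_t>(v) << 1) ^ (v < 0 ? 0xFFFFFFFFu : 0u);
}

inline std::int32_t zigzag_decode(std::uint32_t u) {
    return static_cast<std::int32_t>((u >> 1) ^ (0u - (u & 1u)));
}

// Append u using 7 payload bits per byte; the high bit marks "more bytes follow".
inline void append_varint(std::string& out, std::uint32_t u) {
    while (u >= 0x80u) {
        out.push_back(static_cast<char>((u & 0x7Fu) | 0x80u));
        u >>= 7;
    }
    out.push_back(static_cast<char>(u));
}

// Read one varint starting at pos and advance pos past it.
inline std::uint32_t read_varint(std::string const& in, std::size_t& pos) {
    std::uint32_t result = 0;
    for (int shift = 0;; shift += 7) {
        assert(pos < in.size() && shift < 35); // truncated or malformed input
        auto byte = static_cast<std::uint8_t>(in[pos++]);
        result |= static_cast<std::uint32_t>(byte & 0x7Fu) << shift;
        if ((byte & 0x80u) == 0)
            return result;
    }
}

// Pack a vector as: element count, then each element (ZigZag + varint).
inline std::string pack_ints(std::vector<std::int32_t> const& v) {
    std::string out;
    append_varint(out, static_cast<std::uint32_t>(v.size()));
    for (std::int32_t x : v)
        append_varint(out, zigzag_encode(x));
    return out;
}

inline std::vector<std::int32_t> unpack_ints(std::string const& in) {
    std::size_t pos = 0;
    std::uint32_t n = read_varint(in, pos);
    std::vector<std::int32_t> v;
    v.reserve(n);
    for (std::uint32_t i = 0; i < n; ++i)
        v.push_back(zigzag_decode(read_varint(in, pos)));
    return v;
}
```

With this sketch, values in the range -64..63 take one byte each, while values near the limits of int take five, so it only pays off when most values are small. The packed string could then be written to the archive as a single opaque blob (for instance as a std::string), which is an assumption about how you would wire it in rather than something the answer prescribes.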
