Boost binary archives - reducing size

Views: 22 · Published: 2021/11/17 3:13:58 · Tags: c++, serialization, boost, archive


Question

I am trying to reduce the memory size of Boost archives in C++.

One problem I have found is that Boost's binary archives default to using 4 bytes for any int, regardless of its magnitude. For this reason, an empty Boost binary archive takes 62 bytes while an empty text archive takes 40 (text representation of an empty text archive: 22 serialization::archive 14 0 0 1 0 0 0 0 0).

Is there any way to change this default behavior for ints?

Otherwise, are there any other ways to optimize the size of a binary archive apart from using make_array for vectors?

Recommended answer


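Q. I am trying to reduce the memory size of Boost archives in C++.
See Boost C++ Serialization overhead.

Q. One problem I have found is that Boost's binary archives default to using 4 bytes for any int, regardless of its magnitude.
That's because it's a serialization library, not a compression library.
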
Q. For this reason, an empty Boost binary archive takes 62 bytes while an empty text archive takes 40 (text representation of an empty text archive: 22 serialization::archive 14 0 0 1 0 0 0 0 0).
Use the archive flags, e.g. from Boost Serialization : How To Predict The Size Of The Serialized Result?:
Tune things (boost::archive::no_codecvt, boost::archive::no_header, disable tracking etc.)

Q. Is there any way to change this default behavior for ints?
No. There is BOOST_IS_BITWISE_SERIALIZABLE(T) though (see e.g. Boost serialization bitwise serializability for an example and explanations).
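
For illustration only, here is what a magnitude-dependent integer encoding of the kind the question has in mind could look like. This is a minimal standalone sketch of a LEB128-style variable-length encoding; encode_varint and decode_varint are hypothetical helper names, not part of Boost, and wiring them into an archive would require custom serialization code that is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Boost API): appends `value` to `out` using
// 7 payload bits per byte. The high bit is set on every byte except the last.
void encode_varint(std::uint32_t value, std::vector<std::uint8_t>& out) {
    while (value >= 0x80) {
        // Keep the low 7 bits and set the continuation flag.
        out.push_back(static_cast<std::uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Decodes one value starting at `pos` and advances `pos` past it.
// Assumes well-formed input: this sketch does no bounds or overflow checks.
std::uint32_t decode_varint(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint32_t value = 0;
    int shift = 0;
    while (true) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break;  // last byte of this value
        shift += 7;
    }
    return value;
}
```

With this scheme, values below 128 take one byte, values below 16384 take two, and the largest 32-bit values take five, so it only pays off when most values are small. Signed values would first need a mapping to unsigned (such as zigzag encoding) so that small negative numbers stay short.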

Q. Otherwise, are there any other ways to optimize the size of a binary archive apart from using make_array for vectors?
Using make_array doesn't help for vector<int>:
Live On Coliru

#include <boost/archive/binary_oarchive.hpp>
#include <boost/serialization/vector.hpp>
#include <sstream>
#include <iostream>

// Skip the archive header and disable object tracking to minimize overhead.
static auto const flags = boost::archive::no_header | boost::archive::no_tracking;

template <typename T>
std::string direct(T const& v) {
    std::ostringstream oss;
    {
        boost::archive::binary_oarchive oa(oss, flags);
        oa << v;
    }
    return oss.str();
}

template <typename T>
std::string as_pod_array(T const& v) {
    std::ostringstream oss;
    {
        boost::archive::binary_oarchive oa(oss, flags);
        // Write the element count, then the elements as one contiguous array.
        oa << v.size() << boost::serialization::make_array(v.data(), v.size());
    }
    return oss.str();
}

int main() {
    std::vector<int> i(100);
    std::cout << "direct: "       << direct(i).size() << "\n";
    std::cout << "as_pod_array: " << as_pod_array(i).size() << "\n";
}

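Prints
direct: 408
as_pod_array: 408

Both sizes are identical because binary archives already write a vector of a bitwise-serializable type such as int as one contiguous block, so make_array has nothing left to save. The 408 bytes are consistent with an 8-byte element count followed by 100 four-byte ints (assuming a 64-bit build with 4-byte int).
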

Compression
The most straightforward way to optimize is to compress the resulting stream (see also the added benchmarks here).
Barring that, you will have to override default serialization and apply your own compression (which could be a simple run-length encoding, Huffman coding or something more domain specific).
Live On Coliru

#include <boost/archive/binary_oarchive.hpp>
#include <boost/serialization/vector.hpp>
#include <sstream>
#include <iostream>
#include <boost/iostreams/filter/bzip2.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/copy.hpp>

// Skip the archive header and disable object tracking to minimize overhead.
static auto const flags = boost::archive::no_header | boost::archive::no_tracking;

template <typename T>
size_t archive_size(T const& v)
{
    std::stringstream ss;
    {
        boost::archive::binary_oarchive oa(ss, flags);
        oa << v;
    }

    std::vector<char> compressed;
    {
        boost::iostreams::filtering_ostream fos;
        fos.push(boost::iostreams::bzip2_compressor());
        fos.push(boost::iostreams::back_inserter(compressed));

        boost::iostreams::copy(ss, fos);
    }

    return compressed.size();
}

int main() {
    std::vector<int> i(100);
    std::cout << "bzip2: " << archive_size(i) << "\n";
}

Prints
bzip2: 47

That's a compression ratio of ~11% (47 bytes, down from the 408 uncompressed bytes), or ~19% if you drop the archive flags.
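
As a rough idea of the "apply your own compression" route mentioned above, the following is a minimal sketch of byte-wise run-length encoding applied to an already serialized buffer. The names rle_encode and rle_decode are hypothetical helpers, not Boost API, and the (count, byte) pair format is an assumption chosen for simplicity rather than an established format.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encodes raw bytes as (count, byte) pairs.
// Each run is capped at 255 so that the count fits in a single byte.
std::vector<std::uint8_t> rle_encode(const std::string& bytes) {
    std::vector<std::uint8_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        std::size_t run = 1;
        while (i + run < bytes.size() && bytes[i + run] == bytes[i] && run < 255) {
            ++run;
        }
        out.push_back(static_cast<std::uint8_t>(run));
        out.push_back(static_cast<std::uint8_t>(bytes[i]));
        i += run;
    }
    return out;
}

// Reverses rle_encode. Assumes well-formed input (an even number of bytes).
std::string rle_decode(const std::vector<std::uint8_t>& in) {
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        out.append(static_cast<std::size_t>(in[i]), static_cast<char>(in[i + 1]));
    }
    return out;
}
```

A buffer of 408 zero bytes, for instance, encodes to 4 bytes with this scheme, while data without repeated bytes grows to twice its size. That trade-off is why a general-purpose compressor such as the bzip2 filter is usually the safer default unless the data is known to contain long runs.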
