Serialization/Deserialization of a Vector of Integers in C++
Question
I'm trying to serialize a vector of integers into a string so that it can be stored in a file. My approach is to copy the integers byte-by-byte into a buffer, using the std::copy_n function.
To deserialize, I do the same thing in reverse, i.e. copy byte-by-byte from the buffer into an integer and append those integers to a vector.
I'm not sure if this is the best/fastest way to achieve this.
char *serialize(vector <int> nums)
{
char *buffer = (char *)malloc(sizeof(int)*nums.size());
vector <int>::iterator i;
int j;
for(i = nums.begin(), j = 0; i != nums.end(); i++, j += 4) {
copy_n(i, 4, buffer+j);
}
return buffer;
}
Deserialization function
vector <int> deserialize(char *str, int len)
{
int num;
vector <int> ret;
for(int j = 0; j < len; j+=4) {
copy_n(str+j, 4, &num);
ret.push_back(num);
}
return ret;
}
Any input on how I can improve this bit of code would be really helpful. I would also love to know other approaches to achieve the same thing.
Recommended answer
Your method has a number of problems.
char *serialize(vector <int> nums)
{
char *buffer = (char *)malloc(sizeof(int)*nums.size());
vector <int>::iterator i;
int j;
for(i = nums.begin(), j = 0; i != nums.end(); i++, j += 4) {
copy_n(i, 4, buffer+j);
}
return buffer;
}
1) It allocates memory manually, which is dangerous and rarely necessary.
2) It doesn't do what you think it does. std::copy_n copies elements, not bytes: copy_n(i, 4, buffer+j) takes four int elements starting at i and converts each one to a single char. So the data gets corrupted if any of the values are above 255 (the maximum number stuffable into a char, or 127 where char is signed), and for the last few elements the call reads past the end of the vector.
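To make the second point concrete, here is a minimal sketch contrasting an element-wise copy with a byte-wise copy. The function names (copy_elementwise, serialize_bytes, deserialize_bytes) are hypothetical and chosen for illustration only. The byte-wise version uses a std::string as the buffer, so no manual allocation is needed, but it still has the portability limits of any raw binary dump.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Element-wise copy: std::copy_n converts each int to ONE char,
// so any value that does not fit in a char is truncated.
std::string copy_elementwise(std::vector<int> const& nums)
{
    std::string out(nums.size(), '\0');
    std::copy_n(nums.begin(), nums.size(), out.begin());
    return out;
}

// Byte-wise copy: the full object representation of every int is kept.
// std::string owns the buffer, so there is nothing to free by hand.
std::string serialize_bytes(std::vector<int> const& nums)
{
    std::string out(nums.size() * sizeof(int), '\0');
    if (!nums.empty())
        std::memcpy(&out[0], nums.data(), out.size());
    return out;
}

// Reverse of serialize_bytes; assumes the input was produced on the
// same platform (same sizeof(int) and byte order).
std::vector<int> deserialize_bytes(std::string const& bytes)
{
    std::vector<int> nums(bytes.size() / sizeof(int));
    if (!nums.empty())
        std::memcpy(nums.data(), bytes.data(), nums.size() * sizeof(int));
    return nums;
}
```

With the vector {1, 2, 3, 500, 900}, copy_elementwise produces only five bytes, so 500 and 900 cannot survive, while serialize_bytes produces 5 * sizeof(int) bytes that deserialize_bytes turns back into the original values.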
If you are looking for efficiency, then I think the best way is to write the data directly to the output stream rather than converting it to a string first.
Bear in mind that writing out binary data like this is not portable. I would only use it for serializing/deserializing local data, ideally within a single session. Beyond that you have to start thinking about making the output portable (byte order, integer sizes), and it gets more complicated. Personally I would avoid the binary approach altogether unless absolutely necessary.
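To give an idea of what "making the output portable" can involve, here is a rough sketch that writes every value as a fixed-width 32-bit integer in little-endian byte order, one byte at a time. The helper names (write_u32_le, read_u32_le, serialize_portable, deserialize_portable) are made up for this illustration, and the sketch assumes the element count fits in 32 bits and that the data is std::int32_t rather than plain int.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Write a 32-bit value as four bytes, least significant byte first,
// independent of the host's native byte order.
void write_u32_le(std::ostream& os, std::uint32_t value)
{
    char bytes[4];
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFFu);
    os.write(bytes, 4);
}

// Read four bytes and reassemble them; returns false on a short read.
bool read_u32_le(std::istream& is, std::uint32_t& value)
{
    char bytes[4];
    if (!is.read(bytes, 4))
        return false;
    value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(
                     static_cast<unsigned char>(bytes[i])) << (8 * i);
    return true;
}

// Layout: element count (u32, little-endian), then each element (u32, LE).
std::ostream& serialize_portable(std::ostream& os,
                                 std::vector<std::int32_t> const& v)
{
    write_u32_le(os, static_cast<std::uint32_t>(v.size()));
    for (std::int32_t n : v)
        write_u32_le(os, static_cast<std::uint32_t>(n));
    return os;
}

std::istream& deserialize_portable(std::istream& is,
                                   std::vector<std::int32_t>& v)
{
    std::uint32_t size = 0;
    if (!read_u32_le(is, size))
        return is;
    v.clear();
    for (std::uint32_t i = 0; i < size; ++i) {
        std::uint32_t raw = 0;
        if (!read_u32_le(is, raw))
            break;  // stream is already in a failed state
        v.push_back(static_cast<std::int32_t>(raw));
    }
    return is;
}
```

This trades speed for a well-defined file layout: each element costs a few shifts instead of one bulk write, which is part of why the portable route "gets more complicated".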
If you must do it, I would probably do something more like this:
#include <iostream>
#include <sstream>
#include <type_traits>
#include <vector>
template<typename POD>
std::ostream& serialize(std::ostream& os, std::vector<POD> const& v)
{
// this only works on built in data types (PODs)
static_assert(std::is_trivial<POD>::value && std::is_standard_layout<POD>::value,
"Can only serialize POD types with this function");
auto size = v.size();
os.write(reinterpret_cast<char const*>(&size), sizeof(size));
os.write(reinterpret_cast<char const*>(v.data()), v.size() * sizeof(POD));
return os;
}
template<typename POD>
std::istream& deserialize(std::istream& is, std::vector<POD>& v)
{
static_assert(std::is_trivial<POD>::value && std::is_standard_layout<POD>::value,
"Can only deserialize POD types with this function");
decltype(v.size()) size;
is.read(reinterpret_cast<char*>(&size), sizeof(size));
v.resize(size);
is.read(reinterpret_cast<char*>(v.data()), v.size() * sizeof(POD));
return is;
}
The interface to these functions follows the convention set in the Standard Library, and it is flexible enough that you can use it to serialize to files (using std::fstream) or strings (using std::stringstream).
std::vector<int> v = {1, 2, 3, 500, 900};
std::stringstream oss; // this could just as well be a `std::fstream`
if(serialize(oss, v))
{
std::vector<int> n;
if(deserialize(oss, n))
{
for(auto i: n)
std::cout << i << '\n';
}
}
Output:
1
2
3
500
900
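The same pair of functions can be pointed at a file instead of a std::stringstream. The sketch below repeats the two templates from above so that it compiles on its own, adds a check after reading the size, and wraps them in two helpers, save_to_file and load_from_file, whose names are hypothetical. The streams are opened with std::ios::binary so that no newline translation is applied to the raw bytes on platforms that distinguish text and binary mode.

```cpp
#include <fstream>
#include <istream>
#include <ostream>
#include <string>
#include <type_traits>
#include <vector>

template<typename POD>
std::ostream& serialize(std::ostream& os, std::vector<POD> const& v)
{
    static_assert(std::is_trivial<POD>::value && std::is_standard_layout<POD>::value,
        "Can only serialize POD types with this function");
    auto size = v.size();
    os.write(reinterpret_cast<char const*>(&size), sizeof(size));
    os.write(reinterpret_cast<char const*>(v.data()), v.size() * sizeof(POD));
    return os;
}

template<typename POD>
std::istream& deserialize(std::istream& is, std::vector<POD>& v)
{
    static_assert(std::is_trivial<POD>::value && std::is_standard_layout<POD>::value,
        "Can only deserialize POD types with this function");
    decltype(v.size()) size = 0;
    // Added guard: do not resize from a size that was never read.
    if (!is.read(reinterpret_cast<char*>(&size), sizeof(size)))
        return is;
    v.resize(size);
    is.read(reinterpret_cast<char*>(v.data()), v.size() * sizeof(POD));
    return is;
}

// Hypothetical helper: write the vector to `path`, replacing any old file.
template<typename POD>
bool save_to_file(std::string const& path, std::vector<POD> const& v)
{
    std::ofstream ofs(path, std::ios::binary | std::ios::trunc);
    if (!ofs)
        return false;
    serialize(ofs, v);
    ofs.flush();
    return static_cast<bool>(ofs);
}

// Hypothetical helper: read the vector back; false if the file is
// missing or shorter than expected.
template<typename POD>
bool load_from_file(std::string const& path, std::vector<POD>& v)
{
    std::ifstream ifs(path, std::ios::binary);
    return ifs && deserialize(ifs, v);
}
```

A file written this way is still only meant to be read back by the same program on the same kind of machine, for the portability reasons given earlier.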