What is the most suitable type of vector to keep the bytes of a file?

Question

What is the most suitable type of vector to keep the bytes of a file?

I'm considering using the int type, because the bits "00000000" (1 byte) are interpreted as 0!

The goal is to save this data (bytes) to a file and retrieve it from that file later.

NOTE: The files contain null bytes ("00000000" in bits)!

I'm a bit lost here. Help me! =D Thanks!

UPDATE I:

To read the file I'm using this function:

char* readFileBytes(const char *name){
    std::ifstream fl(name);
    fl.seekg( 0, std::ios::end );
    size_t len = fl.tellg();
    char *ret = new char[len];
    fl.seekg(0, std::ios::beg);
    fl.read(ret, len);
    fl.close();
    return ret;
}

NOTE I: I need to find a way to ensure that the bits "00000000" can be recovered from the file!

NOTE II: Any suggestions for a safe way to save those bits "00000000" to a file?

NOTE III: When using a char array I had problems converting the bits "00000000" to that type.

Code snippet:

int bit8Array[] = {0, 0, 0, 0, 0, 0, 0, 0};
char charByte = (bit8Array[7]     ) | 
                (bit8Array[6] << 1) | 
                (bit8Array[5] << 2) | 
                (bit8Array[4] << 3) | 
                (bit8Array[3] << 4) | 
                (bit8Array[2] << 5) | 
                (bit8Array[1] << 6) | 
                (bit8Array[0] << 7);
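
For readers following along, here is a minimal sketch of the same bit packing done into an unsigned byte instead of a char. The helper name packBits is hypothetical (it is not from the original post), and the sketch assumes the array is ordered most significant bit first, as in the snippet above. The all-zero pattern simply becomes the byte value 0, which is a perfectly valid byte.

```cpp
#include <array>
#include <cstdint>

// Pack eight bits (most significant bit first) into one byte.
// packBits is a hypothetical helper name used only for this sketch.
std::uint8_t packBits(const std::array<int, 8>& bits)
{
    std::uint8_t byte = 0;
    for (int bit : bits) {
        // Shift the bits collected so far and append the next one.
        byte = static_cast<std::uint8_t>((byte << 1) | (bit & 1));
    }
    return byte;
}
```

With this ordering, the pattern 0,1,1,1,1,0,0,0 packs to 0x78, the code for the letter x.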


UPDATE II:

Following @chqrlie's recommendations:

#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <algorithm>
#include <random>
#include <cstring>
#include <iterator>
#include <cstdio>   // for printf

std::vector<unsigned char> readFileBytes(const char* filename)
{
    // Open the file.
    std::ifstream file(filename, std::ios::binary);

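    // Do not skip whitespace: istream_iterator uses formatted extraction,
    // which would otherwise drop spaces and newlines even in binary mode.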
    file.unsetf(std::ios::skipws);

    // Get its size
    std::streampos fileSize;

    file.seekg(0, std::ios::end);
    fileSize = file.tellg();
    file.seekg(0, std::ios::beg);

    // Reserve capacity.
    std::vector<unsigned char> unsignedCharVec;
    unsignedCharVec.reserve(fileSize);

    // Read the data.
    unsignedCharVec.insert(unsignedCharVec.begin(),
               std::istream_iterator<unsigned char>(file),
               std::istream_iterator<unsigned char>());

    return unsignedCharVec;
}

int main(){

    std::vector<unsigned char> unsignedCharVec;

    // txt file contents "xz"
    unsignedCharVec=readFileBytes("xz.txt");

    // Letters -> UTF8/HEX -> bits!
    // x -> 78 -> 0111 1000
    // z -> 7a -> 0111 1010

    for(unsigned char c : unsignedCharVec){
        printf("%c\n", c);
        for(int o=7; o >= 0; o--){
            printf("%i", ((c >> o) & 1));
        }
        printf("%s", "\n");
    }

    // Prints...
    // x
    // 01111000
    // z
    // 01111010

    return 0;
}
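
As an alternative to the iterator-based version above, the whole file can also be read with a single unformatted read() into a vector sized up front. This is only a sketch under stated assumptions: the name readAllBytes and the choice to throw std::runtime_error on failure are mine, not part of the original post.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file into a byte vector with one read() call.
// readAllBytes is a hypothetical name used only for this sketch.
std::vector<unsigned char> readAllBytes(const std::string& path)
{
    // Open in binary mode, positioned at the end so tellg() gives the size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }

    const std::streamsize size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    file.seekg(0, std::ios::beg);

    std::vector<unsigned char> bytes(static_cast<std::size_t>(size));
    // read() copies raw bytes; a null byte is treated like any other value.
    if (size > 0 &&
        !file.read(reinterpret_cast<char*>(bytes.data()), size)) {
        throw std::runtime_error("cannot read " + path);
    }
    return bytes;
}
```

Because read() is unformatted, there is no whitespace skipping to worry about, and the returned vector carries its own length.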


UPDATE III:

This is the code I am using to write to a binary file:

void writeFileBytes(const char* filename, std::vector<unsigned char>& fileBytes){
    std::ofstream file(filename, std::ios::out|std::ios::binary);
    file.write(fileBytes.size() ? (char*)&fileBytes[0] : 0, 
               std::streamsize(fileBytes.size()));
}

writeFileBytes("xz.bin", fileBytesOutput);
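
A slightly more defensive variant of the writer might look like the sketch below. It takes the vector by const reference, skips the write for an empty vector, and reports whether the stream is still in a good state afterwards. The name writeAllBytes and the bool return value are assumptions made for illustration, not something from the original post.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Write every byte of the vector to a file; returns false on failure.
// writeAllBytes is a hypothetical name used only for this sketch.
bool writeAllBytes(const std::string& path,
                   const std::vector<unsigned char>& bytes)
{
    std::ofstream file(path, std::ios::binary | std::ios::trunc);
    if (!file) {
        return false;
    }
    if (!bytes.empty()) {
        // write() is unformatted, so null bytes are written unchanged.
        file.write(reinterpret_cast<const char*>(bytes.data()),
                   static_cast<std::streamsize>(bytes.size()));
    }
    file.flush();
    return static_cast<bool>(file);
}
```

Reading the file back in binary mode should give exactly the same bytes, including any 0x00 values.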


UPDATE IV:

Further reading about UPDATE III:

c++ - Saving a std::vector<unsigned char> to a file

CONCLUSION:

The solution to the problem of the "00000000" bits (1 byte) was definitely to change the type that stores the bytes of the file to std::vector<unsigned char>, as suggested in the answers. std::vector<unsigned char> is a universal type (it exists in all environments) and accepts any octet (unlike my use of char* in "UPDATE I")!

In addition, changing from an array (char) to a vector (unsigned char) was crucial for success! With a vector I can manipulate my data more safely and completely independently of its content (with the char array I had problems with this).

Thanks a lot!

Recommended answer

There are 3 problems in your code:

  • You use the char type and return a char *. Yet the return value is not a proper C string: you neither allocate an extra byte for the '\0' terminator nor null-terminate the data.

  • If the file may contain null bytes, you should probably use type unsigned char or uint8_t to make it explicit that the array does not contain text.

  • You do not return the array size to the caller, so the caller has no way to tell how long the array is. You should probably use a std::vector<uint8_t> or std::vector<unsigned char> instead of an array allocated with new.
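
To illustrate the first and third points, here is a small sketch comparing what a C-string function sees with what a byte vector reports when the data contains null bytes. The function names lengthAsCString and lengthAsBytes are hypothetical and exist only for this comparison.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Length as a C-string function sees it: stops at the first null byte.
// The buffer must contain at least one null byte for this to be safe.
std::size_t lengthAsCString(const char* data)
{
    return std::strlen(data);
}

// Length as a byte vector reports it: counts every byte, nulls included.
std::size_t lengthAsBytes(const std::vector<std::uint8_t>& data)
{
    return data.size();
}
```

For the four bytes 'x', 0x00, 'z', 0x00, the first function returns 1 while the second returns 4, which is why the vector is the safer container for file contents.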
