What is the idiomatic C++17 standard approach to reading binary files?

Question

Normally I would just use C style file IO, but I'm trying a modern C++ approach, including using the C++17 specific features std::byte and std::filesystem.

Reading an entire file into memory, traditional method:

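#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Returns a malloc'd buffer holding the file contents, or NULL on failure.
   The caller must free() the buffer. */
char *readFileData(const char *path)
{
    FILE *f;
    struct stat fs;
    char *buf;

    if (stat(path, &fs) != 0)
        return NULL;

    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    buf = (char *)malloc(fs.st_size);
    if (buf != NULL && fs.st_size > 0 && fread(buf, fs.st_size, 1, f) != 1) {
        free(buf);
        buf = NULL;
    }
    fclose(f);

    return buf;
}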

Reading an entire file into memory, modern approach:

#include <cstddef>
#include <filesystem>
#include <fstream>
#include <memory>
#include <string>
using namespace std;
using namespace std::filesystem;

auto readFileData(string path)
{
    auto fileSize = file_size(path);
    auto buf = make_unique<byte[]>(fileSize);
    // The standard provides no char_traits<byte>, so read through a char stream.
    ifstream ifs(path, ios::binary);
    ifs.read(reinterpret_cast<char*>(buf.get()), static_cast<streamsize>(fileSize));
    return buf;
}

Does this look about right? Can this be improved?

Recommended answer

Personally, I prefer std::vector<std::byte> to std::string unless you are reading an actual text document. The problem with make_unique<byte[]>(fileSize); is that you instantly lose the size of the data and have to carry it in a separate variable. A raw array may be a tiny fraction faster than a std::vector<std::byte> if it is allocated without zero-initialization (make_unique<byte[]>(fileSize) itself value-initializes the array, so it zeroes the buffer just as the vector does). But I think that difference will almost always be overshadowed by the time taken to read from the disk.
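
To make the size-tracking point concrete, here is a minimal sketch. The helper names make_raw_buffer and make_vector_buffer are hypothetical and exist only for illustration: the unique_ptr version has to return the length alongside the pointer, while the vector carries it.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper: with unique_ptr<byte[]>, the length must travel
// next to the pointer, here as the second member of a pair.
std::pair<std::unique_ptr<std::byte[]>, std::size_t> make_raw_buffer(std::size_t n)
{
    return {std::make_unique<std::byte[]>(n), n};
}

// Hypothetical helper: a vector knows its own length via size().
std::vector<std::byte> make_vector_buffer(std::size_t n)
{
    return std::vector<std::byte>(n);
}
```

A caller of make_raw_buffer has to keep both members of the pair together for as long as the buffer lives, whereas the vector can be passed around on its own.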

So for a binary file I use something like this:

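#include <cerrno>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::byte> load_file(std::string const& filepath)
{
    // Open at the end so the end position is available immediately.
    std::ifstream ifs(filepath, std::ios::binary|std::ios::ate);

    if(!ifs)
        throw std::runtime_error(filepath + ": " + std::strerror(errno));

    auto end = ifs.tellg();
    ifs.seekg(0, std::ios::beg);

    // Size is the distance between the end and the beginning positions.
    auto size = std::size_t(end - ifs.tellg());

    if(size == 0) // avoid undefined behavior
        return {};

    std::vector<std::byte> buffer(size);

    if(!ifs.read(reinterpret_cast<char*>(buffer.data()), buffer.size()))
        throw std::runtime_error(filepath + ": " + std::strerror(errno));

    return buffer;
}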

This is the fastest method I know of. It also avoids a common mistake in determining the size of the data in the file: the value ifs.tellg() returns after opening the file at the end is not necessarily the size in bytes, and ifs.seekg(0) is not, in theory, the correct way to locate the start of the file (even though it works in practice in most places). Seeking to std::ios::beg and taking the difference between the two positions avoids both assumptions.

The error message from std::strerror(errno) is guaranteed to work on POSIX systems (that should include Microsoft, but I'm not sure).

Obviously you can use std::filesystem::path const& filepath in place of std::string if you want.
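
As a sketch of that variant, the function below takes a std::filesystem::path instead of a std::string. It assumes a C++17 standard library, where std::ifstream can be constructed directly from a path; filepath.string() is used only to build the error message.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::byte> load_file(std::filesystem::path const& filepath)
{
    // C++17 streams accept a filesystem::path directly.
    std::ifstream ifs(filepath, std::ios::binary | std::ios::ate);
    if (!ifs)
        throw std::runtime_error(filepath.string() + ": " + std::strerror(errno));

    auto end = ifs.tellg();
    ifs.seekg(0, std::ios::beg);
    auto size = std::size_t(end - ifs.tellg());
    if (size == 0) // nothing to read
        return {};

    std::vector<std::byte> buffer(size);
    if (!ifs.read(reinterpret_cast<char*>(buffer.data()),
                  static_cast<std::streamsize>(buffer.size())))
        throw std::runtime_error(filepath.string() + ": " + std::strerror(errno));

    return buffer;
}
```

Callers that hold a std::string can still use this overload, since std::filesystem::path converts implicitly from a string.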

Also, especially for pre-C++17 code, you can use std::vector<char> or std::vector<unsigned char> if you don't have or don't want to use std::byte.
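
A sketch of the same logic without C++17 features might look like the following. The name load_file_uchar is hypothetical, chosen here only to keep it distinct from load_file; the body avoids std::byte, std::filesystem and vector::data() so that it should also compile as C++11.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical pre-C++17 variant: bytes are stored as unsigned char.
std::vector<unsigned char> load_file_uchar(std::string const& filepath)
{
    std::ifstream ifs(filepath.c_str(), std::ios::binary | std::ios::ate);
    if (!ifs)
        throw std::runtime_error(filepath + ": " + std::strerror(errno));

    std::ifstream::pos_type end = ifs.tellg();
    ifs.seekg(0, std::ios::beg);
    std::size_t size = static_cast<std::size_t>(end - ifs.tellg());
    if (size == 0) // nothing to read, and &buffer[0] would be invalid
        return std::vector<unsigned char>();

    std::vector<unsigned char> buffer(size);
    if (!ifs.read(reinterpret_cast<char*>(&buffer[0]),
                  static_cast<std::streamsize>(buffer.size())))
        throw std::runtime_error(filepath + ": " + std::strerror(errno));

    return buffer;
}
```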
