C++: closing an open() file read with mmap

Question

I am using mmap() to read big files quickly, basing my code on the answer to this question (Fast textfile reading in c++).

I am using the second version from sehe's answer:

#include <algorithm>
#include <iostream>
#include <cstdint>   // uintmax_t
#include <cstdio>    // perror
#include <cstdlib>   // exit
#include <cstring>   // memchr

// for mmap:
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>

const char* map_file(const char* fname, size_t& length);

int main()
{
    size_t length;
    auto f = map_file("test.cpp", length);
    auto l = f + length;

    uintmax_t m_numLines = 0;
    while (f && f!=l)
        if ((f = static_cast<const char*>(memchr(f, '\n', l-f))))
            m_numLines++, f++;

    std::cout << "m_numLines = " << m_numLines << "\n";
}

void handle_error(const char* msg) {
    perror(msg);
    exit(255);
}

const char* map_file(const char* fname, size_t& length)
{
    int fd = open(fname, O_RDONLY);
    if (fd == -1)
        handle_error("open");

    // obtain file size
    struct stat sb;
    if (fstat(fd, &sb) == -1)
        handle_error("fstat");

    length = sb.st_size;

    const char* addr = static_cast<const char*>(mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0u));
    if (addr == MAP_FAILED)
        handle_error("mmap");

    // TODO close fd at some point in time, call munmap(...)
    return addr;
}

which works well.

But if I implement it over a loop of several files (I just change the main() function name to:

void readFile(std::string &nomeFile) {

and then get the file content in "f" object in main() function with:

size_t length;
auto f = map_file(nomeFile.c_str(), length);
auto l = f + length;

and call it from main() on a loop over a filenames list), after a while I got:

open: Too many open files

I imagine there would be a way to close the open() call after working on a file, but I can not figure out how and where to put it exactly. I tried:

int fc = close(fd);

at the end of the readFile() function, but it changed nothing.

Thank you very much for your help!

EDIT:

After the important suggestions I received, I made a performance comparison of different approaches with mmap() and std::cin(); see "fast file reading in C++, comparison of different strategies with mmap() and std::cin() results interpretation" for the results.

Recommended answer

Limit to the number of concurrently open files

As you can imagine, keeping a file open consumes resources. So there is in any case a practical limit to the number of open file descriptors on your system. This is why it's highly recommended to close files that you no longer need.

The exact limit depends on the OS and the configuration. If you want to know more, there are already a lot of answers available for this kind of question.
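As an illustration of where that limit comes from, here is a minimal sketch that queries the per-process soft limit on open descriptors with the POSIX getrlimit() call. The helper name max_open_files() is an assumption made for this example; it does not appear in the question or in the referenced answer.

```cpp
#include <sys/resource.h>
#include <cstdio>

// Hypothetical helper: returns the soft limit on open file descriptors
// for the current process, or 0 if the limit cannot be queried.
unsigned long long max_open_files()
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
        perror("getrlimit");
        return 0;
    }
    // rl.rlim_cur is the soft limit; rl.rlim_max is the hard ceiling.
    return static_cast<unsigned long long>(rl.rlim_cur);
}
```

In a shell, `ulimit -n` reports the same soft limit. A loop that opens more files than this without closing any of them will eventually fail in open() with "Too many open files".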

Obviously, to use mmap() you first open a file. Doing so repeatedly in a loop risks reaching the file descriptor limit sooner or later, as you have experienced.

The idea of trying to close the file is not bad. The problem is that close() alone does not release the file while it is still mapped. This is specified in the POSIX documentation:

The mmap() function adds an extra reference to the file associated with the file descriptor fildes which is not removed by a subsequent close() on that file descriptor. This reference is removed when there are no more mappings to the file.

Why? Because mmap() links the file in a special way to the virtual memory management in your system. The file is needed for as long as you use the address range to which it is mapped.
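The POSIX wording quoted above can be observed directly. The following is a minimal sketch, assuming a POSIX system: it closes the descriptor immediately after mmap() and then still reads the file through the mapping. The function name count_lines_fd_closed_early() is hypothetical and exists only for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical demo: counts '\n' bytes in a file through a mapping whose
// descriptor has already been closed. Returns -1 on error.
long count_lines_fd_closed_early(const char* fname)
{
    int fd = open(fname, O_RDONLY);
    if (fd == -1) return -1;

    struct stat sb;
    if (fstat(fd, &sb) == -1) { close(fd); return -1; }
    std::size_t length = static_cast<std::size_t>(sb.st_size);
    if (length == 0) { close(fd); return 0; }   // mmap rejects a zero length

    void* addr = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);   // the descriptor is gone, but the mapping keeps the file referenced
    if (addr == MAP_FAILED) return -1;

    const char* p = static_cast<const char*>(addr);
    const char* end = p + length;
    long lines = 0;
    while ((p = static_cast<const char*>(memchr(p, '\n', end - p)))) {
        ++lines;
        ++p;
    }

    munmap(addr, length);   // only now is the extra reference dropped
    return lines;
}
```

The reads after close(fd) succeed because the mapping holds its own reference to the file; that reference goes away only with the final munmap().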

So how do you remove those mappings? The answer is to use munmap():

The function munmap() removes any mappings for those entire pages containing any part of the address space of the process starting at addr and continuing for len bytes.
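The phrase "those entire pages" means that munmap() works at page granularity. As a rough sketch, assuming addr is page-aligned (as it is when it comes straight from mmap()), the number of bytes actually affected is len rounded up to a multiple of the page size. The helper name unmapped_span() is an assumption for this example.

```cpp
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: number of bytes covered by munmap(addr, len) when
// addr is page-aligned, i.e. len rounded up to whole pages.
std::size_t unmapped_span(std::size_t len)
{
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    return (len + page - 1) / page * page;
}
```

In practice, passing the same addr and length that were used for mmap() is sufficient to remove the whole mapping.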

And of course, close() the file descriptor that you no longer need. A prudent approach would be to close after munmap(), but in principle, at least on a POSIX compliant system, it should not matter when you're closing. Nevertheless, check your latest OS documentation to be on the safe side :-)
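Putting both calls together, one possible way to restructure the code from the question is a small RAII wrapper that calls munmap() and then close() in its destructor, following the prudent ordering described above. This is only a sketch: the class name MappedFile is hypothetical, and readFile() is changed here to take a const reference and return the line count, which differs from the signature in the question.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper: maps a file read-only and releases both the
// mapping and the descriptor when the object goes out of scope.
class MappedFile {
public:
    explicit MappedFile(const std::string& name)
    {
        fd_ = open(name.c_str(), O_RDONLY);
        if (fd_ == -1) throw std::runtime_error("open failed: " + name);

        struct stat sb;
        if (fstat(fd_, &sb) == -1) {
            close(fd_);
            throw std::runtime_error("fstat failed: " + name);
        }
        length_ = static_cast<std::size_t>(sb.st_size);

        if (length_ > 0) {   // mmap rejects a zero length
            void* p = mmap(nullptr, length_, PROT_READ, MAP_PRIVATE, fd_, 0);
            if (p == MAP_FAILED) {
                close(fd_);
                throw std::runtime_error("mmap failed: " + name);
            }
            addr_ = static_cast<const char*>(p);
        }
    }

    ~MappedFile()
    {
        if (addr_) munmap(const_cast<char*>(addr_), length_);  // drop the mapping first
        close(fd_);                                            // then the descriptor
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const char* begin() const { return addr_; }
    const char* end() const { return addr_ + length_; }

private:
    int fd_ = -1;
    const char* addr_ = nullptr;
    std::size_t length_ = 0;
};

// Counts the lines of one file; all resources are released on return.
std::size_t readFile(const std::string& nomeFile)
{
    MappedFile file(nomeFile);
    std::size_t lines = 0;
    const char* f = file.begin();
    const char* l = file.end();
    while (f && f != l) {
        f = static_cast<const char*>(memchr(f, '\n', l - f));
        if (f) { ++lines; ++f; }
    }
    return lines;
}
```

Because the destructor runs at the end of every readFile() call, at most one descriptor and one mapping are held at a time, so calling it in a loop over a long list of file names should no longer exhaust the descriptor limit.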

Note: file mapping is also available on Windows; the documentation about closing the handles is ambiguous on potential memory leaks if there are remaining mappings. This is why I recommend prudence about when to close.
