Reading a file to string with mmap


Question

I'm trying to read a file to a string using mmap.

I'm following this example: http://www.lemoda.net/c/mmap-example/index.html

My code looks like this:

unsigned char *f;
int size;
int main(int argc, char const *argv[])
{
    struct stat s;
    const char * file_name = argv[1];
    int fd = open (argv[1], O_RDONLY);

    /* Get the size of the file. */
    int status = fstat (fd, & s);
    size = s.st_size;

    f = (char *) mmap (0, size, PROT_READ, 0, fd, 0);
    for (i = 0; i < size; i++) {
        char c;

        c = f[i];
        putchar(c);
    }

    return 0;
}

But I always get a segmentation fault when accessing f[i]. What am I doing wrong?

Recommended answer

strace is your friend:

$ strace ./mmap-example mmap-example.c

...
... (lots of output)
...
open("mmap-example.c", O_RDONLY)        = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=582, ...}) = 0
mmap(NULL, 582, PROT_READ, MAP_FILE, 3, 0) = -1 EINVAL (Invalid argument)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

The mmap man page tells you all you need to know ;)

  • EINVAL We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).
  • EINVAL (since Linux 2.6.12) length was 0.
  • EINVAL flags contained neither MAP_PRIVATE or MAP_SHARED, or
    contained both of these values.

The EINVAL error occurs because flags is 0, which is not allowed: exactly one of MAP_PRIVATE or MAP_SHARED has to be specified. (strace prints the 0 as MAP_FILE because that constant is defined as 0 on Linux.) I was able to make it work by using MAP_PRIVATE on Linux, x86-64.
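
To see the same failure without strace, here is a minimal sketch that reports the errno set by mmap(). The helper name try_mmap is hypothetical, and the EINVAL result for a flags value of 0 is the Linux behaviour described in the man page excerpt above; other systems may differ.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only with the given flags.
 * Returns 0 on success, or the errno value of the call that failed. */
static int try_mmap(const char *path, int flags)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    struct stat s;
    if (fstat(fd, &s) < 0) {
        int err = errno;
        close(fd);
        return err;
    }

    size_t size = (size_t) s.st_size;
    void *p = mmap(NULL, size, PROT_READ, flags, fd, 0);

    /* mmap signals failure with MAP_FAILED, not NULL. Save errno
     * before any later call can overwrite it. */
    int err = (p == MAP_FAILED) ? errno : 0;

    if (p != MAP_FAILED)
        munmap(p, size);
    close(fd);
    return err;
}
```

On Linux, try_mmap(path, 0) should return EINVAL for a non-empty regular file, while try_mmap(path, MAP_PRIVATE) should return 0. The program in the question never compares the result with MAP_FAILED, so it goes on to dereference an invalid pointer and crashes.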

So you just have to pass MAP_PRIVATE to mmap():

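#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/mman.h>

int main(int argc, char const *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return EXIT_FAILURE;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* Get the size of the file. */
    struct stat s;
    if (fstat(fd, &s) < 0) {
        perror("fstat");
        return EXIT_FAILURE;
    }
    size_t size = (size_t) s.st_size;

    /* flags must contain exactly one of MAP_PRIVATE or MAP_SHARED. */
    unsigned char *f = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (f == MAP_FAILED) {  /* failure is reported as MAP_FAILED, not NULL */
        perror("mmap");
        return EXIT_FAILURE;
    }

    for (size_t i = 0; i < size; i++)
        putchar(f[i]);

    munmap(f, size);
    close(fd);
    return 0;
}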
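
The mapping is just a run of bytes and is not guaranteed to end in '\0', so it cannot be handed directly to functions that expect a C string. If an actual string is the goal, one option is to copy the mapped bytes into a buffer and terminate it. This is a sketch rather than part of the original answer, and the name read_file_to_string is hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return a malloc'd, NUL-terminated copy of the
 * file's contents, or NULL on failure. The caller frees the result.
 * If len_out is not NULL it receives the length without the NUL. */
static char *read_file_to_string(const char *path, size_t *len_out)
{
    char *str = NULL;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat s;
    if (fstat(fd, &s) == 0 && s.st_size >= 0) {
        size_t len = (size_t) s.st_size;
        str = malloc(len + 1);
        if (str != NULL && len > 0) {
            /* mmap rejects a zero length, so only map non-empty files. */
            void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                free(str);
                str = NULL;
            } else {
                memcpy(str, p, len);
                munmap(p, len);
            }
        }
        if (str != NULL) {
            str[len] = '\0';
            if (len_out != NULL)
                *len_out = len;
        }
    }
    close(fd);
    return str;
}
```

A caller would use it as char *text = read_file_to_string(argv[1], NULL); and call free(text) when done. Copying gives up the zero-copy benefit of mmap, so for plain printing the loop over the mapped bytes is enough.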


Note: my first answer included another possible cause of the EINVAL:

size must be an integral multiple of the page size of the system. To obtain the page size, use the function getpagesize().

This is not actually required, but keep in mind that either way the mapping is always performed in multiples of the system page size. If you want to calculate how much memory is actually available through the returned pointer, update size like this:

int pagesize = getpagesize();
size = s.st_size;
/* Round up to the next page boundary; an exact multiple stays unchanged. */
size = (size + pagesize - 1) / pagesize * pagesize;
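
The rounding can also be written as a small helper that leaves a size already on a page boundary unchanged. This is a sketch: the helper names are hypothetical, and the 4096-byte fallback is only an assumption for the case where the system does not report a page size.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: round `size` up to the next multiple of
 * `pagesize`. A size that is already a multiple is returned unchanged. */
static size_t round_up_to_page(size_t size, size_t pagesize)
{
    return (size + pagesize - 1) / pagesize * pagesize;
}

/* Page size as reported by the system. sysconf(_SC_PAGESIZE) is the
 * POSIX way to ask for the value that getpagesize() returns. */
static size_t system_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t) ps : 4096;  /* 4096 is an assumed fallback */
}
```

Assuming a 4096-byte page, the 582-byte file from the strace output rounds up to 4096, a 4096-byte file stays at 4096, and a 4097-byte file rounds up to 8192. The bytes between the end of the file and the end of the last page read as zero.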
