Tried and true simple file copying code in C?

Problem description

This looks like a simple question, but I didn't find anything similar here.

Since there is no file copy function in C, we have to implement file copying ourselves, but I don't like reinventing the wheel even for trivial stuff like that, so I'd like to ask the cloud:


  1. What code would you recommend for file copying using fopen()/fread()/fwrite()?
  2. What code would you recommend for file copying using open()/read()/write()?


This code should be portable (windows/mac/linux/bsd/qnx/younameit), stable, time tested, fast, memory efficient, etc. Getting into a specific system's internals to squeeze out some more performance is welcome (like getting the filesystem cluster size).

This seems like a trivial question but, for example, the source code for the cp command isn't 10 lines of C code.

Recommended answer

As far as the actual I/O goes, the code I've written a million times in various guises for copying data from one stream to another goes something like this. It returns 0 on success, or -1 with errno set on error (in which case any number of bytes might have been copied).

Note that for copying regular files, you can skip the EAGAIN stuff, since regular files are always blocking I/O. But inevitably if you write this code, someone will use it on other types of file descriptors, so consider it a freebie.

There's a file-specific optimisation that GNU cp does, which I haven't bothered with here: for long blocks of 0 bytes, instead of writing them you just extend the output file by seeking off the end.

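#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

void block(int fd, int event) {
    struct pollfd topoll; // plain "pollfd" only compiles as C++
    topoll.fd = fd;
    topoll.events = event;
    poll(&topoll, 1, -1);
    // no need to check errors - if the stream is bust then the
    // next read/write will tell us
}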

int copy_data_buffer(int fdin, int fdout, void *buf, size_t bufsize) {
    for(;;) {
       char *pos; // char*, because arithmetic on void* is a GNU extension
       // read data to buffer
       ssize_t bytestowrite = read(fdin, buf, bufsize);
       if (bytestowrite == 0) break; // end of input
       if (bytestowrite == -1) {
           if (errno == EINTR) continue; // signal handled
           if (errno == EAGAIN) {
               block(fdin, POLLIN);
               continue;
           }
           return -1; // error
       }

       // write data from buffer
       pos = buf;
       while (bytestowrite > 0) {
           ssize_t bytes_written = write(fdout, pos, bytestowrite);
           if (bytes_written == -1) {
               if (errno == EINTR) continue; // signal handled
               if (errno == EAGAIN) {
                   block(fdout, POLLOUT);
                   continue;
               }
               return -1; // error
           }
           bytestowrite -= bytes_written;
           pos += bytes_written;
       }
    }
    return 0; // success
}

// Default value. I think it will get close to maximum speed on most
// systems, short of using mmap etc. But porters / integrators
// might want to set it smaller, if the system is very memory
// constrained and they don't want this routine to starve
// concurrent ops of memory. And they might want to set it larger
// if I'm completely wrong and larger buffers improve performance.
// It's worth trying several MB at least once, although with huge
// allocations you have to watch for the linux 
// "crash on access instead of returning 0" behaviour for failed malloc.
#ifndef FILECOPY_BUFFER_SIZE
    #define FILECOPY_BUFFER_SIZE (64*1024)
#endif

int copy_data(int fdin, int fdout) {
    // optional exercise for reader: take the file size as a parameter,
    // and don't use a buffer any bigger than that. This prevents 
    // memory-hogging if FILECOPY_BUFFER_SIZE is very large and the file
    // is small.
    for (size_t bufsize = FILECOPY_BUFFER_SIZE; bufsize >= 256; bufsize /= 2) {
        void *buffer = malloc(bufsize);
        if (buffer != NULL) {
            int result = copy_data_buffer(fdin, fdout, buffer, bufsize);
            free(buffer);
            return result;
        }
    }
    // could use a stack buffer here instead of failing, if desired.
    // 128 bytes ought to fit on any stack worth having, but again
    // this could be made configurable.
    return -1; // errno is ENOMEM
}
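
The zero-block optimisation mentioned before the code could be sketched roughly as below. This is not GNU cp's actual implementation, just a minimal illustration assuming a POSIX system and a seekable regular output file opened without O_APPEND. The names copy_data_sparse, copy_file_sparse and SPARSE_BLOCK_SIZE are made up for the example, and the EAGAIN handling is left out to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed hole-detection granularity; an illustrative value, not GNU cp's. */
#define SPARSE_BLOCK_SIZE 4096

/* Returns 1 if the first n bytes of buf are all zero, otherwise 0. */
static int all_zero(const char *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != 0) return 0;
    }
    return 1;
}

/* Hypothetical variant of the copy loop: all-zero blocks are skipped with
 * lseek() instead of being written. Returns 0 on success, or -1 with errno
 * set on error. */
int copy_data_sparse(int fdin, int fdout) {
    char buf[SPARSE_BLOCK_SIZE];
    int ends_in_hole = 0;
    for (;;) {
        ssize_t n = read(fdin, buf, sizeof buf);
        if (n == 0) break;                       /* end of input */
        if (n == -1) {
            if (errno == EINTR) continue;        /* signal handled */
            return -1;
        }
        if (all_zero(buf, (size_t)n)) {
            /* Leave a gap in the output instead of writing zeros. */
            if (lseek(fdout, (off_t)n, SEEK_CUR) == (off_t)-1) return -1;
            ends_in_hole = 1;
            continue;
        }
        ends_in_hole = 0;
        char *pos = buf;
        while (n > 0) {
            ssize_t written = write(fdout, pos, (size_t)n);
            if (written == -1) {
                if (errno == EINTR) continue;
                return -1;
            }
            n -= written;
            pos += written;
        }
    }
    if (ends_in_hole) {
        /* Seeking alone never grows a file, so set the final length. */
        off_t end = lseek(fdout, 0, SEEK_CUR);
        if (end == (off_t)-1 || ftruncate(fdout, end) == -1) return -1;
    }
    return 0;
}

/* Usage sketch: open both files, copy, close. */
int copy_file_sparse(const char *infile, const char *outfile) {
    int fdin = open(infile, O_RDONLY);
    if (fdin == -1) return -1;
    int fdout = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fdout == -1) {
        close(fdin);
        return -1;
    }
    int result = copy_data_sparse(fdin, fdout);
    close(fdin);
    if (close(fdout) == -1) result = -1;
    return result;
}
```

Whether the skipped ranges really become holes on disk depends on the filesystem; the bytes read back from the copy are the same either way.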

To open the input file:

int fdin = open(infile, O_RDONLY|O_BINARY, 0);
if (fdin == -1) return -1;

Opening the output file is tricksy. As a basis, you want:

int fdout = open(outfile, O_WRONLY|O_BINARY|O_CREAT|O_TRUNC, 0x1ff);
if (fdout == -1) {
    close(fdin);
    return -1;
}
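
One portability wrinkle in the two open() calls: O_BINARY comes from Windows/DOS C libraries and is not defined by POSIX, so a common workaround is to define it as 0 where it is missing. The sketch below shows that shim together with the open/copy/close glue, assuming POSIX headers. copy_file and copy_data_simple are names invented for this example, and copy_data_simple is only a cut-down stand-in for copy_data() so that the snippet compiles on its own.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* O_BINARY is a Windows/DOS flag; POSIX systems have no text mode to avoid. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Simplified stand-in for copy_data(): fixed stack buffer, blocking I/O only. */
static int copy_data_simple(int fdin, int fdout) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(fdin, buf, sizeof buf);
        if (n == 0) return 0;                    /* end of input */
        if (n == -1) {
            if (errno == EINTR) continue;
            return -1;
        }
        char *pos = buf;
        while (n > 0) {
            ssize_t written = write(fdout, pos, (size_t)n);
            if (written == -1) {
                if (errno == EINTR) continue;
                return -1;
            }
            n -= written;
            pos += written;
        }
    }
}

/* Hypothetical wrapper: returns 0 on success, -1 with errno set on error. */
int copy_file(const char *infile, const char *outfile) {
    int fdin = open(infile, O_RDONLY | O_BINARY, 0);
    if (fdin == -1) return -1;

    /* 0777 is the same value as the 0x1ff used above; the umask trims it. */
    int fdout = open(outfile, O_WRONLY | O_BINARY | O_CREAT | O_TRUNC, 0777);
    if (fdout == -1) {
        int saved = errno;
        close(fdin);
        errno = saved;
        return -1;
    }

    int result = copy_data_simple(fdin, fdout);
    int saved = errno;
    close(fdin);
    /* close() on the output can be where a deferred write error shows up. */
    if (close(fdout) == -1 && result == 0) {
        result = -1;
        saved = errno;
    }
    if (result == -1) errno = saved;
    return result;
}
```

Checking the return value of close() on the output descriptor is a deliberate choice here: on some filesystems that is the first point at which a failed write is reported.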

But there are confounding factors:


  • you need to special-case when the files are the same, and I can't remember how to do that portably.
  • if the output filename is a directory, you might want to copy the file into the directory.
  • if the output file already exists (open with O_EXCL to determine this and check for EEXIST on error), you might want to do something different, as cp -i does.
  • you might want the permissions of the output file to reflect those of the input file.
  • you might want other platform-specific meta-data to be copied.
  • you may or may not wish to unlink the output file on error.
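
A few of these points can be handled on POSIX systems with fstat(). The sketch below is one possible approach, not what cp does and not portable to Windows: it compares st_dev and st_ino to detect the same-file case, opens without O_TRUNC so nothing is destroyed before that check, optionally uses O_EXCL so an existing output fails with EEXIST, and mirrors the input's permission bits with fchmod(). The names same_file, open_output and fail_with are invented for the example, and returning EINVAL for the same-file case is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both descriptors refer to the same file, 0 if they do not,
 * and -1 if fstat() fails. Relies on POSIX st_dev/st_ino. */
int same_file(int fd1, int fd2) {
    struct stat a, b;
    if (fstat(fd1, &a) == -1 || fstat(fd2, &b) == -1) return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Closes fd, then reports failure with errno set to err. */
static int fail_with(int fd, int err) {
    close(fd);
    errno = err;
    return -1;
}

/* Hypothetical helper: opens outfile for writing as a copy target for fdin.
 * If exclusive is nonzero, an existing outfile makes it fail with EEXIST.
 * Returns the new descriptor, or -1 with errno set. Regular files only. */
int open_output(int fdin, const char *outfile, int exclusive) {
    struct stat st;
    if (fstat(fdin, &st) == -1) return -1;
    mode_t mode = st.st_mode & 0777;             /* permission bits only */

    /* No O_TRUNC yet: truncating first would wipe the input if both names
     * refer to the same file. */
    int flags = O_WRONLY | O_CREAT | (exclusive ? O_EXCL : 0);
    int fdout = open(outfile, flags, mode);
    if (fdout == -1) return -1;

    int same = same_file(fdin, fdout);
    if (same == -1) return fail_with(fdout, errno);
    if (same == 1) return fail_with(fdout, EINVAL);

    /* Now it is safe to discard old contents. fchmod() then sets the mode
     * exactly, because open() filtered it through the umask. */
    if (ftruncate(fdout, 0) == -1) return fail_with(fdout, errno);
    if (fchmod(fdout, mode) == -1) return fail_with(fdout, errno);
    return fdout;
}
```

Note that this version also changes the permissions of an output file that already existed, which is closer to cp -p than to plain cp; drop the fchmod() call if that is not wanted.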

Obviously the answers to all these questions could be "do the same as cp". In which case the answer to the original question is "ignore everything I or anyone else has said, and use the source of cp".

Btw, getting the filesystem's cluster size is next to useless. You'll almost always see speed increasing with buffer size long after you've passed the size of a disk block.
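
The buffer-size claim is easy to check on your own system. The sketch below is a rough timing helper, assuming POSIX clock_gettime(); time_copy and copy_loop are names invented for this example. It copies a file with a given buffer size and returns the elapsed wall-clock seconds, so calling it with, say, 512, 4096, 65536 and 1048576 on a large file gives a quick comparison.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Plain blocking copy loop using the caller's buffer; 0 on success, -1 on error. */
static int copy_loop(int fdin, int fdout, char *buf, size_t bufsize) {
    for (;;) {
        ssize_t n = read(fdin, buf, bufsize);
        if (n == 0) return 0;                    /* end of input */
        if (n == -1) {
            if (errno == EINTR) continue;
            return -1;
        }
        char *pos = buf;
        while (n > 0) {
            ssize_t written = write(fdout, pos, (size_t)n);
            if (written == -1) {
                if (errno == EINTR) continue;
                return -1;
            }
            n -= written;
            pos += written;
        }
    }
}

/* Hypothetical helper: copies src to dst with a bufsize-byte buffer and
 * returns the elapsed seconds, or -1.0 on any error. */
double time_copy(const char *src, const char *dst, size_t bufsize) {
    if (bufsize == 0) return -1.0;
    char *buf = malloc(bufsize);
    if (buf == NULL) return -1.0;

    double elapsed = -1.0;
    int fdin = open(src, O_RDONLY);
    int fdout = (fdin == -1) ? -1 : open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fdout != -1) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        int result = copy_loop(fdin, fdout, buf, bufsize);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        if (result == 0) {
            elapsed = (double)(t1.tv_sec - t0.tv_sec)
                    + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        }
        close(fdout);
    }
    if (fdin != -1) close(fdin);
    free(buf);
    return elapsed;
}
```

Treat the numbers as indicative only: the page cache, the storage device and whatever else the machine is doing all affect them, so repeat each measurement a few times.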
