Tried and true simple file copying code in C?
Question
This looks like a simple question, but I didn't find anything similar here.
Since there is no file copy function in C, we have to implement file copying ourselves, but I don't like reinventing the wheel even for trivial stuff like that, so I'd like to ask the cloud:
- What code would you recommend for file copying using fopen()/fread()/fwrite()?
- What code would you recommend for file copying using open()/read()/write()?
This code should be portable (windows/mac/linux/bsd/qnx/younameit), stable, time tested, fast, memory efficient and etc. Getting into specific system's internals to squeeze some more performance is welcomed (like getting filesystem cluster size).
This seems like a trivial question but, for example, the source code for the cp command isn't 10 lines of C code.
Recommended answer
As far as the actual I/O goes, the code I've written a million times in various guises for copying data from one stream to another goes something like this. It returns 0 on success, or -1 with errno set on error (in which case any number of bytes might have been copied).
Note that for copying regular files, you can skip the EAGAIN stuff, since regular files are always blocking I/O. But inevitably if you write this code, someone will use it on other types of file descriptors, so consider it a freebie.
There's a file-specific optimisation that GNU cp does, which I haven't bothered with here: for long blocks of 0 bytes, instead of writing you just extend the output file by seeking off the end.
#include <errno.h>
#include <poll.h>
#include <stdlib.h>     // malloc, free
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // read, write

void block(int fd, int event) {
    struct pollfd topoll;
    topoll.fd = fd;
    topoll.events = event;
    poll(&topoll, 1, -1);
    // no need to check errors - if the stream is bust then the
    // next read/write will tell us
}

int copy_data_buffer(int fdin, int fdout, void *buf, size_t bufsize) {
    for(;;) {
        char *pos;  // char *, because arithmetic on void * is not standard C
        // read data to buffer
        ssize_t bytestowrite = read(fdin, buf, bufsize);
        if (bytestowrite == 0) break; // end of input
        if (bytestowrite == -1) {
            if (errno == EINTR) continue; // signal handled
            if (errno == EAGAIN) {
                block(fdin, POLLIN);
                continue;
            }
            return -1; // error
        }
        // write data from buffer
        pos = buf;
        while (bytestowrite > 0) {
            ssize_t bytes_written = write(fdout, pos, bytestowrite);
            if (bytes_written == -1) {
                if (errno == EINTR) continue; // signal handled
                if (errno == EAGAIN) {
                    block(fdout, POLLOUT);
                    continue;
                }
                return -1; // error
            }
            bytestowrite -= bytes_written;
            pos += bytes_written;
        }
    }
    return 0; // success
}
// Default value. I think it will get close to maximum speed on most
// systems, short of using mmap etc. But porters / integrators
// might want to set it smaller, if the system is very memory
// constrained and they don't want this routine to starve
// concurrent ops of memory. And they might want to set it larger
// if I'm completely wrong and larger buffers improve performance.
// It's worth trying several MB at least once, although with huge
// allocations you have to watch for the Linux
// "crash on access instead of returning 0" behaviour for failed malloc.
#ifndef FILECOPY_BUFFER_SIZE
    #define FILECOPY_BUFFER_SIZE (64*1024)
#endif

int copy_data(int fdin, int fdout) {
    // optional exercise for reader: take the file size as a parameter,
    // and don't use a buffer any bigger than that. This prevents
    // memory-hogging if FILECOPY_BUFFER_SIZE is very large and the file
    // is small.
    for (size_t bufsize = FILECOPY_BUFFER_SIZE; bufsize >= 256; bufsize /= 2) {
        void *buffer = malloc(bufsize);
        if (buffer != NULL) {
            int result = copy_data_buffer(fdin, fdout, buffer, bufsize);
            free(buffer);
            return result;
        }
    }
    // could use a stack buffer here instead of failing, if desired.
    // 128 bytes ought to fit on any stack worth having, but again
    // this could be made configurable.
    return -1; // errno is ENOMEM
}
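The zero-block optimisation mentioned earlier (seeking instead of writing) can be sketched roughly as follows. This is not GNU cp's actual code, just a minimal illustration under some assumptions: copy_data_sparse and all_zero are hypothetical names, both descriptors are blocking, fdout is a seekable regular file opened without O_APPEND, and the filesystem is one where seeking past the end leaves a hole. A trailing run of zeros has to be made real with ftruncate(), since seeking alone does not change the file size.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the n bytes at p are all zero, 0 otherwise. */
static int all_zero(const char *p, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) return 0;
    }
    return 1;
}

/* Hypothetical sparse-aware variant of copy_data_buffer.
   Returns 0 on success, -1 with errno set on error. */
int copy_data_sparse(int fdin, int fdout, char *buf, size_t bufsize) {
    int pending_hole = 0;  /* set while the output ends in a skipped run */
    for (;;) {
        ssize_t n = read(fdin, buf, bufsize);
        if (n == 0) break;                 /* end of input */
        if (n == -1) {
            if (errno == EINTR) continue;  /* signal handled */
            return -1;
        }
        if (all_zero(buf, (size_t)n)) {
            /* skip forward instead of writing the zeros */
            if (lseek(fdout, (off_t)n, SEEK_CUR) == (off_t)-1) return -1;
            pending_hole = 1;
            continue;
        }
        pending_hole = 0;
        char *pos = buf;
        while (n > 0) {
            ssize_t w = write(fdout, pos, (size_t)n);
            if (w == -1) {
                if (errno == EINTR) continue;
                return -1;
            }
            n -= w;
            pos += w;
        }
    }
    if (pending_hole) {
        /* the file ends in zeros: extend it to the current offset */
        off_t end = lseek(fdout, 0, SEEK_CUR);
        if (end == (off_t)-1 || ftruncate(fdout, end) == -1) return -1;
    }
    return 0;
}
```

The check is done per buffer rather than per disk block, so a buffer that is mostly zeros but contains one non-zero byte is written in full. That keeps the sketch short at the cost of missing some holes.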
To open the input file:
int fdin = open(infile, O_RDONLY|O_BINARY, 0);
if (fdin == -1) return -1;
Opening the output file is tricksy. As a basis, you want:
int fdout = open(outfile, O_WRONLY|O_BINARY|O_CREAT|O_TRUNC, 0x1ff); // 0x1ff == 0777
if (fdout == -1) {
    close(fdin);
    return -1;
}
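Putting the two open() calls together with a copy loop, a whole-file wrapper might look something like the sketch below. The names copy_file and copy_fd are hypothetical, and copy_fd is only a compact stand-in for a stream copy routine (blocking descriptors, fixed buffer). The O_BINARY fallback assumes a POSIX system, where the flag is not defined and not needed.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_BINARY
#define O_BINARY 0  /* assumption: no text/binary distinction on this system */
#endif

/* Compact stand-in copy loop: 0 on success, -1 with errno set on error. */
static int copy_fd(int fdin, int fdout) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(fdin, buf, sizeof buf);
        if (n == 0) return 0;              /* end of input */
        if (n == -1) {
            if (errno == EINTR) continue;
            return -1;
        }
        char *pos = buf;
        while (n > 0) {
            ssize_t w = write(fdout, pos, (size_t)n);
            if (w == -1) {
                if (errno == EINTR) continue;
                return -1;
            }
            n -= w;
            pos += w;
        }
    }
}

/* Hypothetical wrapper: opens both files, copies, and closes them. */
int copy_file(const char *infile, const char *outfile) {
    int fdin = open(infile, O_RDONLY | O_BINARY);
    if (fdin == -1) return -1;

    int fdout = open(outfile, O_WRONLY | O_BINARY | O_CREAT | O_TRUNC, 0777);
    if (fdout == -1) {
        int saved = errno;   /* close() may overwrite errno */
        close(fdin);
        errno = saved;
        return -1;
    }

    int result = copy_fd(fdin, fdout);
    int saved = errno;
    close(fdin);
    /* a failed close() on the output can mean data was not written */
    if (close(fdout) == -1 && result == 0) {
        result = -1;
        saved = errno;
    }
    if (result == -1) errno = saved;
    return result;
}
```

Checking the return value of close() on the output descriptor is a design choice worth keeping: on some filesystems a write error is only reported at that point.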
But there are confounding factors:
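- You need to special-case when the files are the same, and I can't remember how to do that portably.
- If the output filename is a directory, you might want to copy the file into the directory.
- If the output file already exists (open with O_EXCL to determine this, and check for EEXIST on error), you might want to do something different, as cp -i does.
- You might want the permissions of the output file to reflect those of the input file.
- You might want other platform-specific metadata to be copied.
- You may or may not wish to unlink the output file on error.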
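A few of these points can be sketched with POSIX calls. None of this is portable to every system the question lists, and the helper names same_file, open_output_new and copy_mode are hypothetical. Comparing st_dev and st_ino is the usual POSIX way to detect that two descriptors refer to the same file; note that the check has to happen before the output is truncated, so the output would need to be opened without O_TRUNC first.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if both descriptors refer to the same file, 0 if not,
   -1 with errno set if fstat() fails. POSIX only. */
int same_file(int fd1, int fd2) {
    struct stat a, b;
    if (fstat(fd1, &a) == -1 || fstat(fd2, &b) == -1) return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Creates outfile only if it does not already exist. Returns a
   descriptor, or -1 with errno == EEXIST if the file is already there,
   at which point the caller can prompt, as cp -i does. */
int open_output_new(const char *outfile) {
    return open(outfile, O_WRONLY | O_CREAT | O_EXCL, 0666);
}

/* Makes the permission bits of fdout match those of fdin.
   Returns 0 on success, -1 with errno set on error. */
int copy_mode(int fdin, int fdout) {
    struct stat st;
    if (fstat(fdin, &st) == -1) return -1;
    return fchmod(fdout, st.st_mode & 07777);
}
```

For the last point in the list, calling unlink(outfile) after a failed copy is the simplest cleanup, but whether a partial output file should be kept is a policy decision for the caller.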
Obviously the answers to all these questions could be "do the same as cp". In which case the answer to the original question is "ignore everything I or anyone else has said, and use the source of cp".
Btw, getting the filesystem's cluster size is next to useless. You'll almost always see speed increasing with buffer size long after you've passed the size of a disk block.