Fastest way to write integer to file in C

Problem description

I am doing a C programming homework assignment. Bonus points are awarded for writing the file quickly in their upload test system.

I am trying to write a number of lines to a file, each comprising three space-delimited decimal integer strings followed by '\n'. The problem is that fprintf is too slow (their reference time is roughly 1/3 faster).

I have tried a lot of possibilities (everything is in one for loop). fprintf (too slow):

fprintf(f, "%d %d %d\n", a[i], b[i], c[i]);

Converting to a string and then writing the string to the file, which is even worse:

sprintf(buffer, "%d", a[i]); // or: _itoa(a[i], buffer, 10);
fputs(buffer, f);
fputc(' ', f);

Is there any quick way to write integer numbers to a simple text file (.txt)? To give you an idea of the timing, the last solution takes 220 ms and the reference is 140 ms. I have been trying and googling like hell, but nothing is working. But if the time is this short, there has to be some way!

PS: The numbers are always integers, always 4 bytes in size, and always in this format:

a0 b0 c0
a1 b1 c1
a2 b2 c2
a3 b3 c3
etc...

More info: When I submit the solution, I send only two files: file.h and file.c. There is no main etc., so the optimization settings are entirely up to them. The solution should lie in the commands/algorithm (the problem description even states that fprintf is too slow and that we should try something else to speed things up).

Thanks for everything!

Since you want the whole code, here it is:

void save(const str_t * const str, const char *name)
{
  FILE* f;
  int i;

  if(str->cnt == 0)
      return;

  f = fopen(name, "w");
  if(f == NULL)
      return;

  for(i = 0; i < str->cnt; i++)
  {
      fprintf(f, "%d %d %d\n", str->a[i], str->b[i], str->c[i]);
  }
  fclose(f);
}

Recommended answer

You can reduce the overhead of file I/O by writing to the file in large blocks, which reduces the number of individual write operations.

#define CHUNK_SIZE 4096
char file_buffer[CHUNK_SIZE + 64] ;    // 4 KB buffer, plus enough room
                                       // for at least one more line
int buffer_count = 0 ;
int i = 0 ;

while( i < cnt )
{
    buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", a[i], b[i], c[i] ) ;
    i++ ;

    // if the chunk is big enough, write it.
    if( buffer_count >= CHUNK_SIZE )
    {
        fwrite( file_buffer, buffer_count, 1, f ) ;
        buffer_count = 0 ;
    }
}

// Write remainder
if( buffer_count > 0 )
{
    fwrite( file_buffer, buffer_count, 1, f ) ;    
}
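
For readers who want to drop the snippet above into the asker's save function, the following is a minimal sketch of how that might look. The question never shows the definition of str_t, so the struct layout here is an assumption, and save_blocks is a hypothetical name; unlike the original save, it also reports write errors through its return value.

```c
#include <stdio.h>

/* Assumed layout: the question does not show str_t, so this is a guess. */
typedef struct {
    int cnt;
    const int *a, *b, *c;
} str_t;

#define CHUNK_SIZE 4096

/* Hypothetical variant of save(): returns 0 on success, -1 on failure. */
int save_blocks(const str_t *str, const char *name)
{
    char file_buffer[CHUNK_SIZE + 64];  /* slack for one line past the chunk */
    int buffer_count = 0;
    int ok = 1;
    FILE *f;

    if (str->cnt == 0)
        return 0;

    f = fopen(name, "w");
    if (f == NULL)
        return -1;

    for (int i = 0; i < str->cnt; i++) {
        /* Append one formatted line to the in-memory chunk. */
        buffer_count += sprintf(&file_buffer[buffer_count], "%d %d %d\n",
                                str->a[i], str->b[i], str->c[i]);

        /* Once the chunk is big enough, hand it to stdio in one call. */
        if (buffer_count >= CHUNK_SIZE) {
            if (fwrite(file_buffer, 1, (size_t)buffer_count, f)
                    != (size_t)buffer_count)
                ok = 0;
            buffer_count = 0;
        }
    }

    /* Write whatever is left over. */
    if (buffer_count > 0 &&
        fwrite(file_buffer, 1, (size_t)buffer_count, f) != (size_t)buffer_count)
        ok = 0;

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The 64 bytes of slack are enough because a line of three 4-byte integers is at most 36 characters plus the terminating NUL that sprintf adds.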

There may be some advantage in writing exactly 4096 bytes (or some other power of two) in a single write, but that is largely file-system dependent, and the code to do that becomes a little more complicated:

#define CHUNK_SIZE 4096
char file_buffer[CHUNK_SIZE + 64] ;
int buffer_count = 0 ;
int i = 0 ;

while( i < cnt )
{
    buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", a[i], b[i], c[i] ) ;
    i++ ;

    // if the chunk is big enough, write it.
    if( buffer_count >= CHUNK_SIZE )
    {
        fwrite( file_buffer, CHUNK_SIZE, 1, f ) ;
        buffer_count -= CHUNK_SIZE ;
        memcpy( file_buffer, &file_buffer[CHUNK_SIZE], buffer_count ) ;
    }
}

// Write remainder
if( buffer_count > 0 )
{
    fwrite( file_buffer, 1, buffer_count, f ) ;    
}

You might experiment with different values for CHUNK_SIZE. A larger value may be optimal, or you may find that it makes little difference. I suggest at least 512 bytes.
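
A related experiment, which is not what the answer measured, is to leave the fprintf loop alone and change the size of the stream's own buffer with the standard setvbuf function. The sketch below assumes a hypothetical helper named open_buffered; how the library treats the requested size when no buffer is supplied is implementation-dependent, so any benefit would need to be measured.

```c
#include <stdio.h>

/* Hypothetical helper: opens name for writing and requests a fully
   buffered stream of roughly buf_size bytes. Returns NULL on failure. */
FILE *open_buffered(const char *name, size_t buf_size)
{
    FILE *f = fopen(name, "w");
    if (f == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream.
       Passing NULL asks the library to allocate the buffer itself. */
    if (setvbuf(f, NULL, _IOFBF, buf_size) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

A call such as open_buffered(name, 32768) would then replace the fopen call, with the rest of the writing loop unchanged.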

Test results:

Using VC++ 2015, on the following platform:

With a Seagate ST1000DM003 1TB 64MB Cache SATA 6.0Gb/s Hard Drive.

The results for a single test writing 100000 lines are very variable, as you might expect on a desktop system running multiple processes sharing the same hard drive, so I ran the tests 100 times each and selected the minimum time result (as can be seen in the code below the results):

Using default "Debug" build settings (4K blocks):

line_by_line: 0.195000 seconds
block_write1: 0.154000 seconds
block_write2: 0.143000 seconds

Using default "Release" build settings (4K blocks):

line_by_line: 0.067000 seconds
block_write1: 0.037000 seconds
block_write2: 0.036000 seconds

Optimisation had a proportionally similar effect on all three implementations; the fixed-size chunk write was marginally faster than the "ragged" chunk.

When 32K blocks were used, the performance was only slightly higher and the difference between the fixed and ragged versions was negligible:

Using default "Release" build settings (32K blocks):

block_write1: 0.036000 seconds
block_write2: 0.036000 seconds

Using 512 byte blocks was not measurably different from 4K blocks:

Using default "Release" build settings (512 byte blocks):

block_write1: 0.036000 seconds
block_write2: 0.037000 seconds

All the above were 32-bit (x86) builds. Building 64-bit code (x64) yielded interesting results:

Using default "Release" build settings (4K blocks), 64-bit code:

line_by_line: 0.049000 seconds
block_write1: 0.038000 seconds
block_write2: 0.032000 seconds

The ragged block was marginally slower (though perhaps not statistically significantly so). The fixed block was significantly faster, as was the line-by-line write (but not enough to make it faster than any block write).
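
All of the variants measured above still format every line through a printf-style call. One further thing that could be tried, and which was not part of these tests, is converting the integers by hand so that no format string has to be parsed. The sketch below uses two hypothetical helpers, put_int and put_line; whether they actually beat sprintf on a given compiler and library would have to be measured.

```c
#include <stdio.h>

/* Hypothetical helper (not from the answer): writes the decimal form of v
   at dst and returns the number of characters written. No NUL is added. */
static int put_int(char *dst, int v)
{
    char tmp[12];   /* enough digits for a 32-bit int */
    int n = 0;
    int len = 0;
    /* Take the magnitude in unsigned arithmetic so INT_MIN is handled. */
    unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;

    do {            /* digits come out least significant first */
        tmp[n++] = (char)('0' + u % 10u);
        u /= 10u;
    } while (u != 0u);

    if (v < 0)
        dst[len++] = '-';
    while (n > 0)   /* copy the digits back in the right order */
        dst[len++] = tmp[--n];
    return len;
}

/* Hypothetical helper: formats one "a b c\n" line at dst, returns its length. */
static int put_line(char *dst, int a, int b, int c)
{
    int len = put_int(dst, a);
    dst[len++] = ' ';
    len += put_int(dst + len, b);
    dst[len++] = ' ';
    len += put_int(dst + len, c);
    dst[len++] = '\n';
    return len;
}
```

In the block-writing loops, buffer_count += put_line(&file_buffer[buffer_count], a[i], b[i], c[i]) would take the place of the sprintf call; the 64 bytes of slack in file_buffer still cover the longest possible line.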

Test code (4K block version):

#include <stdio.h>
#include <string.h>
#include <time.h>


void line_by_line_write( int count )
{
  FILE* f = fopen("line_by_line_write.txt", "w");
  for( int i = 0; i < count; i++)
  {
      fprintf(f, "%d %d %d\n", 1234, 5678, 9012 ) ;
  }
  fclose(f);       
}

#define CHUNK_SIZE (4096)

void block_write1( int count )
{
  FILE* f = fopen("block_write1.txt", "w");
  char file_buffer[CHUNK_SIZE + 64];
  int buffer_count = 0;
  int i = 0;

  while( i < count )
  {
      buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", 1234, 5678, 9012 );
      i++;

      // if the chunk is big enough, write it.
      if( buffer_count >= CHUNK_SIZE )
      {
          fwrite( file_buffer, buffer_count, 1, f );
          buffer_count = 0 ;
      }
  }

  // Write remainder
  if( buffer_count > 0 )
  {
      fwrite( file_buffer, 1, buffer_count, f );
  }
  fclose(f);       

}

void block_write2( int count )
{
  FILE* f = fopen("block_write2.txt", "w");
  char file_buffer[CHUNK_SIZE + 64];
  int buffer_count = 0;
  int i = 0;

  while( i < count )
  {
      buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", 1234, 5678, 9012 );
      i++;

      // if the chunk is big enough, write it.
      if( buffer_count >= CHUNK_SIZE )
      {
          fwrite( file_buffer, CHUNK_SIZE, 1, f );
          buffer_count -= CHUNK_SIZE;
          memcpy( file_buffer, &file_buffer[CHUNK_SIZE], buffer_count );
      }
  }

  // Write remainder
  if( buffer_count > 0 )
  {
      fwrite( file_buffer, 1, buffer_count, f );
  }
  fclose(f);       

}

#define LINES 100000

int main( void )
{
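    // Sentinel start values: 9999 ticks is about 10 s at VC++'s
    // CLOCKS_PER_SEC of 1000; raise them where CLOCKS_PER_SEC is larger.
    clock_t line_by_line_write_minimum = 9999 ;
    clock_t block_write1_minimum = 9999 ;
    clock_t block_write2_minimum = 9999 ;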

    for( int i = 0; i < 100; i++ )
    {
        clock_t start = clock() ;
        line_by_line_write( LINES ) ;
        clock_t t = clock() - start ;
        if( t < line_by_line_write_minimum ) line_by_line_write_minimum = t ;

        start = clock() ;
        block_write1( LINES ) ;
        t = clock() - start ;
        if( t < block_write1_minimum ) block_write1_minimum = t ;

        start = clock() ;
        block_write2( LINES ) ;
        t = clock() - start ;
        if( t < block_write2_minimum ) block_write2_minimum = t ;
    }

    printf( "line_by_line: %f seconds\n", (float)(line_by_line_write_minimum) / CLOCKS_PER_SEC ) ;
    printf( "block_write1: %f seconds\n", (float)(block_write1_minimum) / CLOCKS_PER_SEC ) ;
    printf( "block_write2: %f seconds\n", (float)(block_write2_minimum) / CLOCKS_PER_SEC ) ;
}
