从stdin C ++读取数百万个整数的最快方法? [英] Fastest way to read millions of integers from stdin C++?

查看:232
本文介绍了从stdin C ++读取数百万个整数的最快方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在研究一个排序项目,现在已经到了一个主要瓶颈,正在读取数据.我的程序大约需要20秒才能使用cinstd::ios::sync_with_stdio(false);对从stdin中读取的100,000,000个整数进行排序,但是事实证明,其中的10秒正在读取要排序的数据.我们确实知道我们将要读取多少个整数(计数在我们需要排序的文件的顶部).

I am working on a sorting project and I've come to the point where a main bottleneck is reading in the data. It takes my program about 20 seconds to sort 100,000,000 integers read in from stdin using cin and std::ios::sync_with_stdio(false); but it turns out that 10 of those seconds is reading in the data to sort. We do know how many integers we will be reading in (the count is at the top of the file we need to sort).

如何使它更快?我知道这是有可能的,因为前一个学期的学生能够在3秒钟多的时间内完成排序计数(这基本上是纯粹的阅读时间).

How can I make this faster? I know it's possible because a student in a previous semester was able to do counting sort in a little over 3 seconds (and that's basically purely read time).

程序仅使用整数文件(如$ ./program < numstosort.txt

The program is just fed the contents of a file with integers separated by newlines like $ ./program < numstosort.txt

谢谢

以下是相关代码:

    std::ios::sync_with_stdio(false);
    int max;
    cin >> max;
    short num;
    short* a = new short[max];
    int n = 0;
    while(cin >> num) { 
        a[n] = num;
        n++;
    }

推荐答案

假定Linux/POSIX在商品硬件上运行,这将以最快的速度将数据存储到内存中.请注意,由于显然不允许使用编译器优化,因此C ++ IO并不是读取数据的最快方法.正如其他人指出的那样,如果不进行优化,C ++代码将无法以最快的速度运行.

This will get your data into memory about as fast as possible, assuming Linux/POSIX running on commodity hardware. Note that since you apparently aren't allowed to use compiler optimizations, C++ IO is not going to be the fastest way to read data. As others have noted, without optimizations the C++ code will not run anywhere near as fast as it can.

鉴于重定向文件已经以stdin/STDIN_FILENO的形式打开,请使用低级系统调用/C风格的IO.不需要 进行优化,因为它将尽可能快地运行:

Given that the redirected file is already open as stdin/STDIN_FILENO, use low-level system call/C-style IO. That won't need to be optimized, as it will run just about as fast as possible:

struct stat sb;
int rc = ::fstat( STDIN_FILENO, &sb );

// use C-style calloc() to get memory that's been
// set to zero as calloc() is often optimized to be
// faster than a new followed by a memset().
char *data = ::calloc( 1, sb.st_size + 1 );
size_t totalRead = 0UL;
while ( totalRead  < st.st_size )
{
    ssize_t bytesRead = ::read( STDIN_FILENO,
        data + totalRead, sb.st_size - totalRead );
    if ( bytesRead <= 0 )
    {
        break;
    }
    totalRead += bytesRead;
}

// data is now in memory - start processing it

该代码会将您的数据作为一个长C样式字符串读入内存.缺乏编译器优化一点也不重要,因为它们几乎都是裸机系统调用.

That code will read your data into memory as one long C-style string. And the lack of compiler optimizations won't matter one bit as it's all almost bare-metal system calls.

使用fstat()获取文件大小可以一次分配所有所需的内存-无需realloc()或在其周围复制数据.

Using fstat() to get the file size allows allocating all the needed memory at once - no realloc() or copying data around is necessary.

您将需要添加一些错误检查,并且代码的更强大版本将进行检查,以确保从fstat()返回的数据实际上是具有实际大小的常规文件,而不是无用的"猫"(例如cat filename | YourProgram),因为在这种情况下,fstat()调用不会返回有用的文件大小.您需要在调用后检查struct statsb.st_mode字段,以查看stdin流的真正含义:

You'll need to add some error checking, and a more robust version of the code would check to be sure the data returned from fstat() actually is a regular file with an actual size, and not a "useless use of cat" such as cat filename | YourProgram, because in that case the fstat() call won't return a useful file size. You'll need to examine the sb.st_mode field of the struct stat after the call to see what the stdin stream really is:

::fstat( STDIN_FILENO, &sb );
...
if ( S_ISREG( sb.st_mode ) )
{
    // regular file...
}

(对于真正高性能的系统,确保将要读取数据的内存页实际映射到进程地址空间中非常重要.如果数据到达的速度比内核的内存管理速度更快,则性能可能会停滞不前系统可以为数据转储到的页面创建虚拟到物理的映射.)

(And for really high-performance systems, it can be important to ensure that the memory pages you're reading data into are actually mapped in your process address space. Performance can really stall if data arrives faster than the kernel's memory management system can create virtual-to-physical mappings for the pages data is getting dumped into.)

要尽可能快地处理大型文件,您需要使用多线程,其中一个线程读取数据并提供一个或多个数据处理线程,这样您就可以在读取数据之前开始处理数据.

To handle a large file as fast as possible, you'd want to go multithreaded, with one thread reading data and feeding one or more data processing threads so you can start processing data before you're done reading it.

解析数据.

同样,阻止编译器优化可能会使C ++操作的开销比C风格的处理慢.基于该假设,一些简单可能会运行得更快.

Again, preventing compiler optimizations probably makes the overhead of C++ operations slower than C-style processing. Based on that assumption, something simple will probably run faster.

在未优化的二进制文件中,这可能会以更快的速度工作,假设数据是按上述方式读取的C样式字符串:

This would probably work a lot faster in a non-optimized binary, assuming the data is in a C-style string read in as above:

char *next;
long count = ::strtol( data, &next, 0 );
long *values = new long[ count ];

for ( long ii = 0; ii < count; ii++ )
{
    values[ ii ] = ::strtol( next, &next, 0 );
}

这也是非常脆弱的.它依靠 strtol()跳过领先的空白,这意味着如果数值之间除了空格以外还有其他任何东西,它将失败.它还依赖于正确的值的初始计数.再次-如果不正确,该代码将失败.而且因为它可以在检查错误之前替换next的值,所以如果它由于不良数据而偏离正常轨道,那将无可救药地丢失.

That is also very fragile. It relies on strtol() skipping over leading whitespace, meaning if there's anything other than whitespace between the numeric values it will fail. It also relies on the initial count of values being correct. Again - that code will fail if that's not true. And because it can replace the value of next before checking for errors, if it ever goes off the rails because of bad data it'll be hopelessly lost.

但是在不允许编译器优化的情况下应该尽可能快.

But it should be about as fast as possible without allowing compiler optimizations.

这就是不允许编译器优化的疯狂所在.您可以编写简单,健壮的C ++代码来执行所有处理,利用良好的优化编译器,并且运行的速度几乎可以与我发布的代码一样快-不会进行错误检查,如果喂入将以意外和未定义的方式严重失败意外数据.

That's what crazy about not allowing compiler optimizations. You can write simple, robust C++ code to do all your processing, make use of a good optimizing compiler, and probably run almost as fast as the code I posted - which has no error checking and will fail spectacularly in unexpected and undefined ways if fed unexpected data.

这篇关于从stdin C ++读取数百万个整数的最快方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆