How to count lines fast?


Problem description


I tried unxutils' wc -l, but it crashed on 1 GB files. I tried this C# code:

// f is the path to the file; StreamReader splits it into lines for us.
long count = 0;
using (StreamReader r = new StreamReader(f))
{
    string line;
    while ((line = r.ReadLine()) != null)
    {
        count++;
    }
}

return count;


It reads a 500MB file in 4 seconds

// Requires System.Linq. Reads the file in 256-byte chunks and counts the '\n' bytes.
var size = 256;
var bytes = new byte[size];
var count = 0;
byte query = Convert.ToByte('\n');
using (var stream = File.OpenRead(file))
{
    int many;
    do
    {
        many = stream.Read(bytes, 0, size);
        // Only inspect the bytes actually read; otherwise leftover data from
        // the previous chunk gets counted again on the final iteration.
        count += bytes.Take(many).Count(a => a == query);
    } while (many == size);
}

reads in 10 seconds.

// Reads the file one byte at a time; ReadByte returns -1 at end of stream.
var count = 0;
int query = Convert.ToByte('\n');
using (var stream = File.OpenRead(file))
{
    int current;
    do
    {
        current = stream.ReadByte();
        if (current == query)
        {
            count++;
        }
    } while (current != -1);
}

takes 7 seconds.


Is there anything faster that I haven't tried yet?

Recommended answer


Your first approach does look like the optimal solution already. Keep in mind that you're mostly not CPU bound but limited by the HD's read speed, which at 500MB / 4sec = 125MB/s is already quite fast. The only way to get faster than that is via RAID or using SSDs, not so much via a better algorithm.
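To check that claim on a given machine, here is a minimal sketch (the program layout and variable names are illustrative, and it assumes the file path is passed as the first command-line argument) that times a raw sequential read against the same read with newline counting. If both passes take about as long, the bottleneck is the disk rather than the counting loop:

using System;
using System.Diagnostics;
using System.IO;

class ThroughputCheck
{
    static void Main(string[] args)
    {
        string file = args[0];              // path of the file to measure
        var buffer = new byte[1 << 20];     // 1 MB read buffer

        // Pass 1: just read the file, count nothing.
        var sw = Stopwatch.StartNew();
        long totalBytes = 0;
        using (var stream = File.OpenRead(file))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                totalBytes += read;
        }
        sw.Stop();
        Console.WriteLine($"Raw read: {totalBytes / (1024.0 * 1024):F0} MB in {sw.Elapsed.TotalSeconds:F2} s");

        // Pass 2: the same read, but also count '\n' bytes.
        sw.Restart();
        long lines = 0;
        using (var stream = File.OpenRead(file))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                    if (buffer[i] == (byte)'\n')
                        lines++;
            }
        }
        sw.Stop();
        Console.WriteLine($"Counted {lines} lines in {sw.Elapsed.TotalSeconds:F2} s");
    }
}

Note that after the first pass the operating system may serve the file from its cache, which makes the second pass look faster than a cold read; for a fair comparison use a file larger than RAM or drop the cache between runs.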
