golang: How do I determine the number of lines in a file efficiently?


Problem description




In Go, I am looking for an efficient way to determine how many lines a file has.

Of course, I can always loop through the entire file, but that does not seem very efficient.

file, err := os.Open("/path/to/filename")
if err != nil {
    log.Fatal(err)
}
defer file.Close()

fileScanner := bufio.NewScanner(file)
lineCount := 0
for fileScanner.Scan() {
    lineCount++
}
fmt.Println("number of lines:", lineCount)

Is there a better (quicker, less expensive) way to find out how many lines a file has?

Solution

Here's a faster line counter using bytes.Count to find the newline characters.

It's faster because it removes all the extra logic and buffering required to return whole lines, and it takes advantage of the assembly-optimized functions the bytes package offers for searching a byte slice.

Larger buffers also help here, especially with larger files. On my system, with the file I used for testing, a 32k buffer was fastest.

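// lineCounter counts the newline characters read from r.
// A final line that does not end in '\n' is not counted.
func lineCounter(r io.Reader) (int, error) {
    buf := make([]byte, 32*1024) // 32 KB read buffer
    count := 0
    lineSep := []byte{'\n'}

    for {
        c, err := r.Read(buf)
        // Count before checking err: Read may return data along with an error.
        count += bytes.Count(buf[:c], lineSep)

        switch {
        case err == io.EOF:
            return count, nil

        case err != nil:
            return count, err
        }
    }
}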

Benchmark output for the three variants:

BenchmarkBuffioScan   500      6408963 ns/op     4208 B/op    2 allocs/op
BenchmarkBytesCount   500      4323397 ns/op     8200 B/op    1 allocs/op
BenchmarkBytes32k     500      3650818 ns/op     65545 B/op   1 allocs/op
