How to read a file character by character in Go
Problem description
I have some large json files I want to parse, and I want to avoid loading all of the data into memory at once. I'd like a function/loop that can return me each character one at a time.
I found this example for iterating over words in a string, and the ScanRunes function in the bufio package looks like it could return a character at a time. I also had the ReadRune function from bufio mostly working, but that felt like a pretty heavy approach.
EDIT
I compared 3 approaches. All used a loop to pull content from either a bufio.Reader or a bufio.Scanner.
1. Read runes in a loop using .ReadRune on a bufio.Reader. Checked for errors from the call to .ReadRune.
2. Read bytes from a bufio.Scanner after calling .Split(bufio.ScanRunes) on the scanner. Called .Scan and .Bytes on each iteration, checking the .Scan call for errors.
3. Same as #2, but read text from a bufio.Scanner instead of bytes, using .Text. Instead of joining a slice of runes with string([]runes), I joined a slice of strings with strings.Join([]strings, "") to form the final blobs of text.
The timing for 10 runs of each on a 23 MB json file was:
1. 0.65 s
2. 2.40 s
3. 0.97 s
So it looks like ReadRune is not too bad after all. It also results in a shorter, less verbose call, because each rune is fetched in 1 operation (.ReadRune) instead of 2 (.Scan and .Bytes).
Just read each rune one by one in the loop... See example