How to read a file character by character in Go


Question

I have some large JSON files I want to parse, and I want to avoid loading all of the data into memory at once. I'd like a function or loop that returns one character at a time.

I found this example for iterating over words in a string, and the ScanRunes function in the bufio package looks like it could return a character at a time. I also had the ReadRune function from bufio mostly working, but that felt like a pretty heavy approach.

EDIT

I compared 3 approaches. All used a loop to pull content from either a bufio.Reader or a bufio.Scanner.

  1. Read runes in a loop using .ReadRune on a bufio.Reader. Checked for errors from the call to .ReadRune.
  2. Read bytes from a bufio.Scanner after calling .Split(bufio.ScanRunes) on the scanner. Called .Scan and .Bytes on each iteration, checking the .Scan call for errors.
  3. Same as #2, but read text from the bufio.Scanner using .Text instead of bytes. Instead of joining a slice of runes with string([]runes), I joined a slice of strings with strings.Join([]strings, "") to form the final blobs of text.

The timing for 10 runs of each on a 23 MB JSON file was:

  1. 0.65 s
  2. 2.40 s
  3. 0.97 s

So it looks like ReadRune is not too bad after all. It also results in a shorter, less verbose call, because each rune is fetched in 1 operation (.ReadRune) instead of 2 (.Scan and .Bytes).

Solution

Just read each rune one by one in the loop... See example
