OutOfMemory error when using Apache Commons lineIterator


Problem description

I'm trying to iterate line-by-line over a 1.2GB file using Apache Commons FileUtils.lineIterator. However, as soon as LineIterator calls hasNext() I get a java.lang.OutOfMemoryError: Java heap space. I've already allocated 1G to the Java heap.

What am I doing wrong here? After reading some docs, isn't LineIterator supposed to stream the file from the file system rather than load it into memory?

Note the code is in Scala:

  val file = new java.io.File("data_export.dat")
  val it = org.apache.commons.io.FileUtils.lineIterator(file, "UTF-8")
  var successCount = 0L
  var totalCount = 0L
  try {
    while (it.hasNext()) {
      try {
        // Parse each line into a LegacyEvent and convert it
        val legacy = parse[LegacyEvent](it.nextLine())
        BehaviorEvent(legacy)
        successCount += 1L
      } catch {
        case e: Exception => println("Parse error")
      }
      totalCount += 1L
    }
  } finally {
    it.close()
  }

Thanks for your help!

Answer

The code looks good. Most likely the file does not contain a line terminator where you expect one, so a single very long line, larger than 1GB, is read into memory.
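For context, lineIterator streams the file one line at a time (internally pulling lines from a buffered reader), so only the current line is held on the heap — but that one line must still fit in memory. A minimal sketch of the same pattern, assuming nothing beyond plain java.io (the countLines helper and the in-memory input are illustrative, not from the original post):

```scala
import java.io.{BufferedReader, StringReader}

// Sketch of the streaming pattern lineIterator relies on: readLine()
// materializes one line at a time as a String. Memory use is bounded
// by the longest line, not the file size -- so a 1.2GB file with no
// line terminators would still be read as one gigantic String.
def countLines(reader: BufferedReader): Long = {
  var count = 0L
  var line = reader.readLine()
  while (line != null) {
    count += 1L
    line = reader.readLine()
  }
  count
}

// Example with an in-memory reader: three short lines.
countLines(new BufferedReader(new StringReader("a\nb\nc"))) // returns 3
```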

Try wc -l in Unix and see how many lines you get.
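If wc -l is not convenient, the same check can be done from Scala: a small diagnostic that streams the file byte by byte, never materializing a line, and reports both the line count and the longest line. This is a sketch, not part of the original answer; lineStats is a hypothetical helper name.

```scala
import java.io.{BufferedInputStream, FileInputStream}

// Stream a file byte by byte and return (lineCount, longestLineBytes).
// Lines are delimited by '\n'; no line is ever buffered, so this runs
// in constant memory even if the file is one enormous line.
def lineStats(path: String): (Long, Long) = {
  val in = new BufferedInputStream(new FileInputStream(path))
  try {
    var lines = 0L
    var current = 0L   // bytes seen on the current line so far
    var longest = 0L
    var b = in.read()
    while (b != -1) {
      if (b == '\n') {
        lines += 1L
        if (current > longest) longest = current
        current = 0L
      } else {
        current += 1L
      }
      b = in.read()
    }
    if (current > 0L) { // final line without a trailing newline
      lines += 1L
      if (current > longest) longest = current
    }
    (lines, longest)
  } finally {
    in.close()
  }
}

// Usage (path is the file from the question):
// val (count, longest) = lineStats("data_export.dat")
```

If the longest line comes back near the file size, the diagnosis above is confirmed: the file has no usable line terminators for LineIterator to split on.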
