Java OutOfMemoryError with StringBuilder
Question
I'm getting a Java OutOfMemoryError when I call this method. I'm using it in a loop to parse many large files in sequence. My guess is that the string returned by result.toString() is not being garbage collected properly during the loop. If so, how should I fix it?
private String matchHelper(String buffer, String regex, String method) {
    Pattern abbrev_p = Pattern.compile(regex); // norms U.S.A., B.S., PH.D, PH.D.
    Matcher abbrev_matcher = abbrev_p.matcher(buffer);
    StringBuffer result = new StringBuffer();
    while (abbrev_matcher.find()) {
        abbrev_matcher.appendReplacement(result, abbrevHelper(abbrev_matcher));
    }
    abbrev_matcher.appendTail(result);
    String tempResult = result.toString(); // ERROR OCCURS HERE
    return tempResult;
}
Recommended answer
Written this way, you'll need roughly 6 bytes of memory for every character in the file.
Each character is two bytes (a Java char is a 16-bit UTF-16 code unit). You have the raw input, the substituted output (in the buffer), and, at the point where you run out of memory, you are asking for a third copy by calling toString().
If the file is encoded in something like ASCII or ISO-8859-1 (a single-byte character encoding), that means it will be six times larger in memory than on disk.
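The arithmetic above can be written out as a small sketch. It only restates the answer's estimate (2 bytes per character, three copies held at once); the class name, method name, and the 100 MiB file size are made up for illustration, and real JVM overhead (object headers, buffer over-allocation) is ignored.

```java
final class MemoryEstimate {
    // Assumption taken from the answer: every character costs 2 bytes in memory.
    static final int BYTES_PER_CHAR = 2;
    // The input String, the StringBuffer contents, and the String from toString().
    static final int COPIES = 3;

    // Rough peak memory, in bytes, for a text of the given length in characters.
    static long estimatePeakBytes(long charCount) {
        return charCount * BYTES_PER_CHAR * COPIES;
    }

    public static void main(String[] args) {
        // Hypothetical 100 MiB file in a single-byte encoding: one char per byte on disk.
        long fileBytes = 100L * 1024 * 1024;
        long peak = estimatePeakBytes(fileBytes);
        System.out.println(peak / (1024 * 1024) + " MiB"); // prints "600 MiB"
    }
}
```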
You could allocate more memory to the process (for example, by raising the JVM's maximum heap size with -Xmx), but a better solution might be to process the input "streamwise": read, scan, and write the data without loading it all into memory at once.
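One way the streamwise idea might look is sketched below, processing one line at a time so that only the current line and its replacement are in memory. This is an illustration under assumptions, not the asker's code: normalize is a hypothetical stand-in for abbrevHelper (which the question does not show), the abbreviation pattern is invented, and working line by line only helps if no match needs to span a line break.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StreamingReplace {
    // Hypothetical stand-in for the question's abbrevHelper: drops the periods.
    static String normalize(Matcher m) {
        return m.group().replace(".", "");
    }

    // Reads a line, rewrites its matches, writes it out, then moves on.
    // Assumes matches never span a line break.
    static void replaceStreamwise(Reader in, Writer out, Pattern pattern) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher m = pattern.matcher(line);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                // quoteReplacement keeps '$' and '\' in the replacement literal.
                m.appendReplacement(sb, Matcher.quoteReplacement(normalize(m)));
            }
            m.appendTail(sb);
            writer.write(sb.toString());
            writer.newLine(); // line endings are normalized to the platform separator
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        // Illustrative pattern only: two or more "letter + period" groups, e.g. U.S.A.
        Pattern abbrev = Pattern.compile("(?:[A-Z]\\.){2,}");
        StringWriter out = new StringWriter();
        replaceStreamwise(new StringReader("Made in the U.S.A. today"), out, abbrev);
        System.out.print(out);
    }
}
```

For real files, the StringReader and StringWriter in main would be replaced by a file reader and a file writer (for instance from java.nio.file.Files.newBufferedReader and newBufferedWriter), so the output goes to disk as it is produced instead of accumulating in memory.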