Java - memory usage
Question
I'm developing an application that loads a lot of data (for example, from CSV).
I'm creating a List<List<SimpleCell>> and loading the cells I read into it. The SimpleCell class contains 5 Strings, and each String has 10 characters on average.
So I figured that reading 1000 rows of 160 columns each gives 1000 * 160 = 160 000 SimpleCell instances, which should take roughly 160 000 * sizeof(SimpleCell.class) =~ 160 000 * 10 * 5 = 8 000 000 bytes =~ 7.63 MB.
But when I look at jconsole (after clicking Perform GC), memory usage is about 790 MB. How can this be?
Note that I don't store references to any "temporary" objects. Here is the code that runs while the memory usage rises:
for (int i = r.getFromIndex(); i <= r.getToIndex(); ++i) {
    System.out.println("Processing: 'ZZ " + i + "'");
    List<SimpleCell> values = saxRead("ZT/ZZ " + i + "");
    rows.add(values);
}
saxRead just creates an InputStream, parses it with SAX, closes the stream, and returns the cells (created by the SAXHandler). So it only uses local variables, which I assume will be garbage collected soon afterwards.
I'm getting an out of heap error when reading 1000 rows, but I need to read approximately 7,000.
Obviously there's something I don't know about JVM memory. Why is memory usage so high when loading this relatively small amount of data?
Recommended answer
A String uses 48 bytes plus 2 bytes for each character of text. The SimpleCell object uses 40 bytes, and the List holding a row's cells uses 1064 bytes.
This means each row uses 1064 + 160 * 40 + 5 * 160 * (48 + 20) bytes, or about 62 KB. If you have 1000 rows you will be using about 62 MB, which is much less than what you are seeing.
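The arithmetic can be written out as a small sketch. It plugs in the question's figures (160 columns, 5 strings per cell, 10 characters per string) and the approximate per-object sizes quoted in this answer. The class and method names are made up for illustration, and the constants are estimates, not measurements.

```java
// Back-of-the-envelope heap estimate using the approximate sizes quoted in the answer.
// Class and method names are hypothetical; the constants are estimates, not measured values.
final class RowEstimate {
    static final long STRING_OVERHEAD = 48; // bytes per String, excluding its characters
    static final long BYTES_PER_CHAR = 2;   // bytes per character of text
    static final long CELL_OVERHEAD = 40;   // bytes per SimpleCell object
    static final long LIST_OVERHEAD = 1064; // bytes per row List

    /** Estimated bytes retained by one row of cells. */
    static long bytesPerRow(int columns, int stringsPerCell, int charsPerString) {
        long perString = STRING_OVERHEAD + BYTES_PER_CHAR * charsPerString;
        long perCell = CELL_OVERHEAD + stringsPerCell * perString;
        return LIST_OVERHEAD + columns * perCell;
    }

    public static void main(String[] args) {
        long perRow = bytesPerRow(160, 5, 10);
        System.out.println("per row:   " + perRow + " bytes");
        System.out.println("1000 rows: " + (1000 * perRow) / 1_000_000 + " MB (approx.)");
    }
}
```

Whatever the exact constants, the estimate lands in the tens of megabytes, far below the 790 MB reported by jconsole.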
I suggest you use a memory profiler, such as VisualVM or YourKit, to see exactly how much memory is being used and by what.
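Before reaching for a profiler, a rough first check is to compare the used heap before and after loading. The sketch below is only an approximation: System.gc() is a hint the JVM may ignore, and the HeapCheck class and its stand-in loading loop are hypothetical placeholders for your own code.

```java
import java.util.ArrayList;
import java.util.List;

// Rough heap measurement around a loading step; not a substitute for a profiler.
final class HeapCheck {
    /** Heap currently in use, as reported by the JVM (approximate). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc(); // a hint only; the JVM may ignore it
        long before = usedHeapBytes();

        // Stand-in for the real loading loop: 160 000 short strings.
        List<String> data = new ArrayList<>();
        for (int i = 0; i < 160_000; i++) {
            data.add("cell " + i);
        }

        System.gc();
        long after = usedHeapBytes();
        System.out.println("retained: ~" + (after - before) / 1024
                + " KiB for " + data.size() + " strings");
    }
}
```

If the number is far above your estimate, a heap dump in VisualVM will show which classes dominate.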
Depending on how you construct the Strings, you may retain even more memory than this. For example, it's likely you are retaining a reference to the original XML: on older JVMs (before Java 7 update 6), a substring shares the character array of the String it was taken from, so keeping a small substring keeps the whole original alive.
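One way to avoid holding on to a larger buffer is to copy only the characters you need while parsing. The following is a minimal sketch, not the asker's actual handler: the CompactCellReader and CellHandler names and the element name "c" are assumptions made for illustration. It collects text into a StringBuilder and calls toString(), which produces a String that does not share the parser's buffer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class CompactCellReader {
    // Hypothetical handler: collects the text of every <c> element.
    static final class CellHandler extends DefaultHandler {
        final List<String> cells = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting fresh text for this element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Copy only the relevant slice of the parser's buffer.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("c".equals(qName)) {
                cells.add(text.toString()); // a new String independent of the parser's buffer
            }
        }
    }

    /** Parses the XML text and returns the contents of its <c> elements. */
    static List<String> read(String xml) {
        try {
            CellHandler handler = new CellHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.cells;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

For example, CompactCellReader.read("<row><c>a</c><c>b</c></row>") returns a list containing "a" and "b".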
You may find this class useful. It reduces the memory used by Strings that hold on to more than they need, and it reduces duplicates using a fixed-size cache.
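import java.lang.ref.WeakReference;

// Declared static because it is meant to be nested inside another class.
static class StringCache {
    // Fixed-size table of weakly held strings; a colliding entry overwrites the old one.
    final WeakReference<String>[] strings;
    final int mask;

    @SuppressWarnings("unchecked")
    StringCache(int size) {
        // Round the capacity up to a power of two (minimum 128) so the hash can be masked.
        int size2 = 128;
        while (size2 < size)
            size2 *= 2;
        strings = new WeakReference[size2];
        mask = size2 - 1;
    }

    public String intern(String text) {
        if (text.length() == 0) return "";
        int hash = text.hashCode() & mask;
        WeakReference<String> wrs = strings[hash];
        if (wrs != null) {
            // Reuse the cached instance if it is still alive and equal.
            String ret = wrs.get();
            if (text.equals(ret))
                return ret;
        }
        // On older JVMs, new String(text) copies only the characters in use,
        // dropping any larger shared array.
        String ret = new String(text);
        strings[hash] = new WeakReference<String>(ret);
        return ret;
    }
}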