Large unexplained memory in the memory dump of a .NET process


Question

I can't explain most of the memory used by a C# process. The process uses 10 GB in total, but reachable and unreachable objects together add up to only 2.5 GB. What could the remaining 7.5 GB be?

I'm looking for the most likely explanations, or a method to find out what this memory is.

Here is the precise situation. The process runs on .NET 4.5.1. It downloads pages from the internet and processes them with machine learning. According to VMMap, the memory is almost entirely in the managed heap, which seems to rule out an unmanaged memory leak.

The process had been running for days and its memory grew slowly. At some point it reached 11 GB. I stopped everything running in the process and ran garbage collections, including large object heap compaction, several times at one-minute intervals:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
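
To put a number on the gap described above from inside the process, one option is to compare what the GC reports as allocated with the private bytes of the process. This is only a minimal sketch: the class and method names (`MemoryGap`, `CompactEverything`, `ManagedBytes`, `PrivateBytes`) are made up for illustration, and a large difference only hints at free space inside the managed heap or at memory outside it; it does not say which.

```csharp
using System;
using System.Diagnostics;
using System.Runtime;

// Hypothetical helper, not part of the original post.
public static class MemoryGap
{
    // Full blocking collection that also compacts the large object heap.
    public static void CompactEverything()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
        // Let finalizers run, then collect what they released.
        GC.WaitForPendingFinalizers();
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }

    // Bytes the GC currently thinks are allocated (free space in the heap is not counted).
    public static long ManagedBytes()
    {
        return GC.GetTotalMemory(false);
    }

    // Private bytes committed to the whole process, managed or not.
    public static long PrivateBytes()
    {
        using (Process p = Process.GetCurrentProcess())
        {
            return p.PrivateMemorySize64;
        }
    }
}
```

Logging `MemoryGap.PrivateBytes() - MemoryGap.ManagedBytes()` after each call to `CompactEverything()` would show whether the unexplained part keeps growing over time.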

The memory went down to 10 GB. Then I created the dump:

procdump -ma psid

The dump is 10 GB, as expected.

I opened the dump with .NET Memory Profiler (version 5.6). It shows a total of 2.2 GB of reachable objects and 0.3 GB of unreachable objects. What could explain the remaining 7.5 GB?

Possible explanations I've been thinking of:

  • The LOH was not fully compacted
  • Some memory is used outside the objects shown by the profiler

Recommended answer

After investigation, the problem turned out to be heap fragmentation caused by pinned buffers. I'll explain how I investigated and what pinned buffers are.

All the profilers I used agreed that most of the heap was free. Next I needed to look at fragmentation. This can be done with WinDbg (with the SOS extension loaded), for example:

!dumpheap -stat

Then I looked at the "Fragmented blocks larger than..." section. WinDbg reports objects lying between the free blocks, which makes compaction impossible. Then I looked at what was holding these objects and whether they were pinned. Here, for example, is the object at address 0000000bfaf93b80:

!gcroot 0000000bfaf93b80

It shows the reference graph:

00000004082945e0 (async pinned handle)
-> 0000000535b3a3e0 System.Threading.OverlappedData
-> 00000006f5266d38 System.Threading.IOCompletionCallback
-> 0000000b35402220 System.Net.Sockets.SocketAsyncEventArgs
-> 0000000bf578c850 System.Net.Sockets.Socket
-> 0000000bf578c900 System.Net.SocketAddress
-> 0000000bfaf93b80 System.Byte[]

00000004082e2148 (pinned handle)
-> 0000000bfaf93b80 System.Byte[]

The last two lines tell you the object is pinned.

Pinned objects are buffers that can't be moved because their address is shared with unmanaged code. Here you can guess that it is the system TCP layer. When managed code needs to pass the address of a buffer to external code, it has to "pin" the buffer so that the address remains valid: the GC cannot move it.
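
As an illustration of what pinning means in practice, here is a minimal sketch using `GCHandle` with `GCHandleType.Pinned`. The async socket API pins its buffers internally through a different handle kind (the "async pinned handle" in the `!gcroot` output), so this is an analogy, not the code path from the dump; `PinDemo` and its method name are made up for the example.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical demo class, not part of the original post.
public static class PinDemo
{
    // Pins a buffer, forces a collection, and checks that the address did not change.
    public static bool AddressIsStableWhilePinned()
    {
        byte[] buffer = new byte[4096];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr before = handle.AddrOfPinnedObject();
            GC.Collect();   // the GC may relocate other objects, but not a pinned one
            IntPtr after = handle.AddrOfPinnedObject();
            return before == after;
        }
        finally
        {
            // Unpin. A handle that is never freed keeps the buffer fixed in place for good.
            handle.Free();
        }
    }
}
```

The `finally` block is the important part: a pin that outlives its I/O operation is what leaves a small immovable object in the middle of otherwise free heap space.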

These buffers, although they are a very small part of the memory, make compaction impossible and thus cause a large memory "leak". It is not exactly a leak, more a fragmentation problem. This can happen on the LOH or on the generational heaps just the same. The question now is what causes these pinned objects to live forever: finding the root cause of the leak that causes the fragmentation.

You can read similar questions here:

.NET deletes pinned allocated buffer (good explanation of pinned objects in the answer)

Note: the root cause was in a third-party library, AerospikeClient, which uses the .NET async Socket API, an API known for pinning the buffers passed to it. AerospikeClient did use a buffer pool properly, but the pool was re-created whenever the client was re-created. Since we re-created the client every hour instead of creating one and keeping it for the lifetime of the process, the buffer pool was re-created each time. This caused a growing number of pinned buffers, which in turn caused unbounded fragmentation. What remains unclear is why old buffers are never unpinned when transmission is over, or at least when the client is disposed.
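
A common way to limit this kind of fragmentation is to allocate one large array once, keep it for the lifetime of the process, and hand out fixed-size segments of it, so that whatever gets pinned always lives in the same block. The sketch below assumes that approach; `SlabBufferPool` is a hypothetical class written for this example and is not how AerospikeClient implements its pool.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool: one long-lived slab, sliced into equal segments.
public sealed class SlabBufferPool
{
    private readonly byte[] _slab;
    private readonly int _segmentSize;
    private readonly Stack<int> _freeOffsets = new Stack<int>();
    private readonly object _lock = new object();

    public SlabBufferPool(int segmentCount, int segmentSize)
    {
        _segmentSize = segmentSize;
        _slab = new byte[segmentCount * segmentSize];
        // Push offsets in reverse so that the first Rent() returns offset 0.
        for (int i = segmentCount - 1; i >= 0; i--)
        {
            _freeOffsets.Push(i * segmentSize);
        }
    }

    // Hands out a free segment of the slab, or throws if none is left.
    public ArraySegment<byte> Rent()
    {
        lock (_lock)
        {
            if (_freeOffsets.Count == 0)
            {
                throw new InvalidOperationException("Pool exhausted");
            }
            return new ArraySegment<byte>(_slab, _freeOffsets.Pop(), _segmentSize);
        }
    }

    // Makes a segment available again; rejects segments from another array.
    public void Return(ArraySegment<byte> segment)
    {
        if (!ReferenceEquals(segment.Array, _slab))
        {
            throw new ArgumentException("Segment does not belong to this pool");
        }
        lock (_lock)
        {
            _freeOffsets.Push(segment.Offset);
        }
    }
}
```

A segment can be given to a `SocketAsyncEventArgs` with `SetBuffer(segment.Array, segment.Offset, segment.Count)`. The point of the design is that the pool is created once and shared: if it is re-created along with each client, as described above, the benefit is lost.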
