Are embedded resources in a .NET Assembly loaded from disk or from memory at runtime?


Question


When I use GetManifestResourceStream to retrieve an embedded resource from a .NET assembly, what kind of I/O is involved?

I see two possibilities:

  1. The entire assembly was already put into memory when .NET loaded it, so GetManifestResourceStream is just accessing memory.

  2. Only the code parts of the assembly were put into memory when the assembly was loaded by .NET, so GetManifestResourceStream needs to go back to the .dll file to extract the embedded resource.

I'm pretty sure the first is the case, especially since assemblies can be loaded dynamically from raw data with Assembly.Load(Byte[]). But then I wonder what happens if a very large file (say several gigabytes) was embedded - the second option might be more efficient. Does size matter?

Just challenging some long-held assumptions, and not able to find much in the way of reference on this.

Answer

"Memory" is not a precise enough term on a demand-paged virtual memory operating system like Windows, Linux or macOS. The CLR maps the assembly into the address space of the process using a memory-mapped file (MMF). At that point the mapping is just numbers to the processor, one for every 4096-byte page. Nothing is read from the file just yet.

That is delayed until the program tries to read from an address inside the mapped range. The first access generates a page fault; the kernel allocates RAM for the page and fills it with the file content, after which the program resumes as though nothing happened. This is the "you don't pay for what you don't use" advantage of virtual memory at work.
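The mapping behaviour described above can be tried out with any memory-mapped file API. The following is a minimal sketch in Python rather than .NET, and it illustrates only the operating system mechanism, not the CLR's loader. The helper names `make_demo_file` and `read_mapped_slice` are made up for this example.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # page size reported by the OS, commonly 4096 bytes


def make_demo_file(num_pages):
    """Write a temp file in which page i is filled with the byte value i % 256."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * PAGE)
    return path


def read_mapped_slice(path, offset, length):
    """Map the whole file read-only, then touch only one small range of it.

    Creating the mapping reserves address space; the kernel brings file
    content into RAM only for the pages that are actually accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view[offset:offset + length])


if __name__ == "__main__":
    demo_path = make_demo_file(8)
    try:
        # Only the page holding this range needs to be faulted in.
        print(read_mapped_slice(demo_path, 5 * PAGE, 4))
    finally:
        os.remove(demo_path)
```

Reading four bytes from the sixth page does not require the other pages to be loaded, which is the same reason an untouched embedded resource costs address space but no RAM.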

There is no "extraction": you are reading the resource data directly from memory, the most efficient way it could have been implemented. An embedded resource does not otherwise behave any differently from other data in the file, like the metadata and the MSIL. You likewise don't pay for any code in the assembly that you never call.

Do keep in mind that an embedded resource occupies the same OS resource as the GC heap: it too requires address space. The only real difference is that GC heap address space is backed by the OS paging file and can never be shared with other processes, while the assembly data is backed by the assembly file and can be shared. Large resources notably shrink the amount of memory you can allocate in a .NET program, even if you never use them. That matters only in a 32-bit process; a 64-bit process has many terabytes of address space.
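To put rough numbers on the address-space cost, here is a back-of-the-envelope sketch in Python. It assumes 4096-byte pages, as in the answer, and a 2 GiB user-mode address space, which is the default for a 32-bit Windows process. The 512 MiB resource is a hypothetical figure, and the function names are invented for this example.

```python
PAGE_SIZE = 4096   # bytes per page, as stated in the answer
MIB = 1024 ** 2
GIB = 1024 ** 3


def pages_for(size_bytes, page_size=PAGE_SIZE):
    """Number of whole pages of address space needed to map size_bytes."""
    return -(-size_bytes // page_size)  # ceiling division


def remaining_address_space(user_space_bytes, mapped_bytes):
    """Address space left for everything else once the mapping is in place."""
    return max(user_space_bytes - mapped_bytes, 0)


if __name__ == "__main__":
    resource = 512 * MIB  # hypothetical embedded resource
    print(pages_for(resource), "pages reserved")
    print(remaining_address_space(2 * GIB, resource) // MIB, "MiB left in a 2 GiB space")
```

The reservation is made whether or not the resource is ever read, which is why it only hurts when address space is scarce.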

Another restriction is that an MMF view can never be larger than 2 GB, even in a 64-bit process, which sets a hard upper limit on the maximum size of a resource. That usually keels over very early, failing the build with CS1566, "Specified argument was out of the range of valid values". Not a great diagnostic, btw.
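Since the build error is not very descriptive, a size check before embedding can give a clearer message. This is a small sketch in Python that takes the 2 GB ceiling quoted above as given. The function name and the idea of running it as a pre-build step are assumptions for illustration, not part of any .NET tooling.

```python
import os
import tempfile

# The 2 GB ceiling quoted in the answer, taken as given rather than measured.
RESOURCE_LIMIT_BYTES = 2 * 1024 ** 3


def fits_as_embedded_resource(path, limit=RESOURCE_LIMIT_BYTES):
    """Return True if the file at path is smaller than the assumed limit."""
    return os.path.getsize(path) < limit


def demo():
    """Check a 10-byte temp file against the default limit and a tiny one."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 10)
        path = f.name
    try:
        return (
            fits_as_embedded_resource(path),
            fits_as_embedded_resource(path, limit=5),
        )
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```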
