How can I shrink the Linux page cache from within kernel space?

Question

I'm working on a system that involves some custom hardware and a custom Linux device driver I wrote for the hardware. The system occasionally needs to move large amounts of data very rapidly and therefore my driver dynamically (i.e. when needed) allocates large (1 GB) DMA buffers which are used and then freed when they are no longer needed. To allocate such large buffers I actually allocate a bunch of smaller buffers (256 X 4MB) using dma_alloc_coherent and then map them contiguously into user space using remap_pfn_range. This works very well most of the time.

During testing, after the system has been running test cases for a long time, I sometimes see DMA allocation failures where one of the dma_alloc_coherent calls in my driver fails, which causes my application layer software to crash. I was finally able to track down this problem, and I discovered that when I see DMA allocation failures, the Linux kernel page cache is very full.

For example, on the last failure that I captured, the page cache filled 27 GB of the 32 GB of RAM on my system. I suspected that the page cache "fullness" was causing the dma_alloc_coherent calls to fail. To test this theory, I manually emptied the page cache using:

# echo 1 > /proc/sys/vm/drop_caches

This dropped the size of the cache from 27 GB to 94 MB and I was able to allocate 20+ 1 GB DMA buffers with no issues.
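One way to watch the cache size before and after such an experiment is to read the Cached field of /proc/meminfo. The following is a minimal userspace sketch in C, assuming the usual "Name:   value kB" layout of that file. The helper name meminfo_field_kb is made up for this illustration, and it parses a text buffer rather than opening /proc itself so that it can be tried on sample input.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the value (in kB) of a /proc/meminfo-style
 * field such as "Cached", or -1 if the field is not present in `text`. */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* Match "key:" only at the start of a line, so that "Cached"
         * does not accidentally match "SwapCached". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;

            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;            /* step past the newline */
    }
    return -1;
}
```

To use it on a live system, read /proc/meminfo into a buffer (for example with fopen and fread), terminate the buffer with a NUL byte, and pass it in with the key "Cached".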

Clearly the page cache is a beneficial thing, so I would prefer not to have to completely empty it every time I run out of space when allocating DMA buffers. My question is this: how can I dynamically shrink the page cache in kernel space such that, if a call to dma_alloc_coherent fails, I can recover just enough space to retry the call and have it succeed?

My system is x86_64 based running a 3.16.x Linux kernel.

I have found some vague references that suggest what I'm attempting may be possible, for example "These objects are automatically reclaimed by the kernel when memory is needed elsewhere on the system." (from: https://www.kernel.org/doc/Documentation/sysctl/vm.txt). But I have not yet found any specifics that indicate how the memory is reclaimed.

Any assistance with this would be greatly appreciated!

Recommended answer

TL;DR: Scan the active superblocks and drop the clean (non-dirty) page-cache pages of each one until you have reclaimed as much system memory as you need (or until you run out of active superblocks).

How to write kernel code to dynamically shrink the fs page-cache,
to recover just enough space so that a subsequent call to dma_alloc_coherent() succeeds?

To answer this question, let us take a look at what the "drop_caches operation" did to reduce the fs page-cache from 27GB to 94MB on your system.

  1. echo 1 > /proc/sys/vm/drop_caches
    invokes
    drop_caches_sysctl_handler()

which in turn invokes iterate_supers() and
passes it the pointer to the function drop_pagecache_sb().

What happens next is that iterate_supers() scans for active superblocks, and every time it finds one, it calls drop_pagecache_sb(), passing it a reference to that active superblock.

This iterative procedure continues until every active superblock has been visited and its droppable page-cache pages have been released. This is a non-destructive operation and will only free cached data that is completely unused. Dirty objects remain in use until they are written out to disk and cannot be freed. If you run sync first to flush them out to disk, the "drop_caches operation" tends to free more memory.
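The "sync first, then drop" order can be sketched from userspace as follows. This is only an illustration of the existing sysctl interface, not kernel-side code. The function name sync_and_drop is hypothetical, and the control file path is passed in as a parameter (normally /proc/sys/vm/drop_caches, which requires root) so that the sketch can be exercised against an ordinary file.

```c
#define _XOPEN_SOURCE 500   /* makes sync() visible under strict -std= modes */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: flush dirty data, then write `mode` to the
 * drop_caches control file at `path`.
 * mode: 1 = page cache, 2 = dentries and inodes, 3 = both.
 * Returns 0 on success, -1 on failure. */
int sync_and_drop(const char *path, int mode)
{
    FILE *f;
    int ok;

    sync();   /* write dirty pages back so they become clean and droppable */

    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%d\n", mode) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

On a real system the call would be sync_and_drop("/proc/sys/vm/drop_caches", 1), which empties the whole page cache just like the echo command in the question.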

Since you are interested in running this process to reclaim a limited, known amount of memory (i.e. what is about to be requested using dma_alloc_coherent()), you simply need to implement the above functionality with an additional check at the end of each iteration, and abort the superblock scan as soon as the amount of free system memory crosses the desired level.
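The control flow of such a bounded scan can be sketched with a small userspace model. This is not kernel code: struct sb_model and shrink_until are invented for this illustration, each array entry stands in for one active superblock, and "dropping" clean pages is modelled by zeroing a counter. In a real driver the per-superblock step would be whatever drop_pagecache_sb() does, and the stop condition would be based on the amount of free memory.

```c
#include <stddef.h>

/* Userspace model only: one entry per active superblock. */
struct sb_model {
    const char *name;
    size_t clean_pages;   /* cached pages that could be dropped right away */
    size_t dirty_pages;   /* need writeback first; never dropped here */
};

/* Drop clean pages superblock by superblock and stop as soon as
 * `target_pages` have been reclaimed. Returns the number of pages
 * reclaimed, which is below the target if the scan ran out of
 * superblocks first. */
size_t shrink_until(struct sb_model *sbs, size_t n, size_t target_pages)
{
    size_t reclaimed = 0;

    /* The loop condition is the "additional check": it aborts the scan
     * once enough memory has been recovered. */
    for (size_t i = 0; i < n && reclaimed < target_pages; i++) {
        reclaimed += sbs[i].clean_pages;
        sbs[i].clean_pages = 0;
    }
    return reclaimed;
}
```

Superblocks that come after the point where the target is met keep their cached pages, which is the whole benefit over writing 1 to drop_caches.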


A couple of points to keep in mind to further optimise this procedure:

  • Is there a preference for certain block devices over others?
    You may want to iterate over active superblocks of the block devices that you do not care about first. If enough memory is not reclaimed, then scan the block devices that you would prefer to retain in the fs page-cache unless absolutely necessary to reclaim required memory. get_active_super() might be of help here.

  • iterate_supers_type() seems interesting.
    It allows one to iterate over the superblocks of a specific file_system_type.
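The preference order described in the points above can be sketched as a two-pass scan. Again this is a userspace model under stated assumptions rather than kernel code: struct fs_model, its keep_cached flag, and shrink_by_preference are all hypothetical names, and the filesystem types in the usage are arbitrary examples. In the kernel, each pass might correspond to one or more iterate_supers_type() calls for the relevant filesystem types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: one entry per mounted filesystem. */
struct fs_model {
    const char *fstype;
    bool keep_cached;     /* true: only shrink this one as a last resort */
    size_t clean_pages;   /* cached pages that could be dropped right away */
};

/* Reclaim up to `target_pages`, touching the preferred filesystems only
 * if the expendable ones did not yield enough. Returns pages reclaimed. */
size_t shrink_by_preference(struct fs_model *fs, size_t n, size_t target_pages)
{
    size_t reclaimed = 0;

    /* Pass 0: expendable filesystems. Pass 1: the ones we would rather keep. */
    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < n; i++) {
            if (reclaimed >= target_pages)
                return reclaimed;
            if (fs[i].keep_cached != (pass == 1))
                continue;   /* belongs to the other pass */
            reclaimed += fs[i].clean_pages;
            fs[i].clean_pages = 0;
        }
    }
    return reclaimed;
}
```

With one preferred filesystem holding 500 clean pages and two expendable ones holding 100 and 200, a request for 250 pages is satisfied entirely from the expendable pair and leaves the preferred cache intact.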

Please note that this is a speculative solution, based purely on an analysis of the existing kernel code behind the drop_caches operation that you have already observed to solve your problem. Implementing the above approach simply gives you control over that same operation, i.e. it attempts to reclaim fs page-cache memory only to the extent required for your immediate needs.
