Is os.walk() much faster after the first run due to page caching?
Question
I use os.walk to iterate over, say, 1000 files (iteration only; the files themselves are not processed). The first run is slow, but subsequent runs on the same path are about 20 times faster.
As far as I know, neither os.walk nor os.listdir (which os.walk uses) does any caching, and neither does FindFirstFile/FindNextFile (which os.listdir uses on my Windows platform).
So is this due to page caching or something else?
FYI, I'm trying to write a backup application that needs to process a huge number of files. If the speedup is indeed due to page caching, then I'll need to write my own caching mechanism.
Recommended answer
Your OS does the caching here. Directory lookups require disk access, which is slow, so the operating system caches the results of such access heavily.
For example, the ntfs.sys driver uses the Data Map service to cache filesystem metadata such as directory listings.
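To see the effect for yourself, you can time several consecutive walks of the same tree. The following is only a minimal sketch: the helper names count_files and time_walks are made up for illustration, and the starting path "." is a placeholder for whatever directory you want to measure. It only iterates over directory entries and never opens a file, which matches the scenario in the question.

```python
import os
import time


def count_files(root):
    """Walk root and return the number of files seen, without opening any."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def time_walks(root, runs=3):
    """Return a list of (file_count, seconds) for consecutive walks of root."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        n = count_files(root)
        results.append((n, time.perf_counter() - start))
    return results


if __name__ == "__main__":
    # "." is a placeholder; point this at the tree you actually back up.
    for i, (n, secs) in enumerate(time_walks("."), start=1):
        print(f"run {i}: {n} files in {secs:.4f}s")
```

If the OS has not touched the tree recently, the first run will typically be the slowest and later runs noticeably faster. The exact ratio depends on the disk, the filesystem, and whatever else is already cached, so treat the numbers as indicative only.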