A Faster way of Directory walking instead of os.listdir?


Problem description

I am trying to improve the performance of elfinder, an ajax-based file manager (elRTE.ru).

It uses os.listdir recursively to walk through all directories, and this causes a performance hit (for example, listing a directory with 3000+ files takes 7 seconds).

I am trying to improve its performance. Here is its walking function:

        # Descend into every accepted subdirectory, skipping symlinks,
        # and build the nested tree structure recursively.
        for d in os.listdir(path):
            pd = os.path.join(path, d)
            if os.path.isdir(pd) and not os.path.islink(pd) and self.__isAccepted(d):
                tree['dirs'].append(self.__tree(pd))

My questions are:


  1. If I switch to os.walk instead of os.listdir, will it improve performance?
  2. How about using dircache.listdir()? That is, cache the WHOLE directory/subdirectory contents on the initial request and return the cached results if no new files have been uploaded and nothing has changed. (A rough sketch of this caching idea follows the list.)
  3. Is there any other method of directory walking that is faster?
  4. Is there any other server-side file browser written in Python that is fast? (Though I would prefer to make this one fast.)
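As a minimal illustration of the caching idea in question 2 (the names `_listing_cache` and `cached_listdir` are hypothetical, not part of elfinder), the listing can be keyed on the directory's mtime, which on Linux changes whenever an entry is added, removed or renamed, but not when a file's contents change:

    import os

    # Hypothetical module-level cache: path -> (mtime when cached, entries)
    _listing_cache = {}

    def cached_listdir(path):
        """Return os.listdir(path), reusing the cached listing while the
        directory's mtime is unchanged."""
        mtime = os.stat(path).st_mtime
        cached = _listing_cache.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]
        entries = os.listdir(path)
        _listing_cache[path] = (mtime, entries)
        return entries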


Accepted answer

I was just trying to figure out how to speed up os.walk on a largish file system (350,000 files spread out within around 50,000 directories). I'm on a Linux box using an ext3 file system. I discovered that there is a way to speed this up for MY case.

Specifically, using a top-down walk, any time os.walk returns a list of more than one directory, I use os.stat to get the inode number of each directory and sort the directory list by inode number. This makes the walk mostly visit the subdirectories in inode order, which reduces disk seeks.
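A minimal sketch of that approach (the function name `walk_inode_order` is mine, not from the answer); it relies on the fact that with a top-down os.walk you may reorder the dirnames list in place to control which directories are descended into next:

    import os

    def walk_inode_order(top):
        """Like os.walk, but visit subdirectories in inode order to
        reduce disk seeks on filesystems such as ext3."""
        for dirpath, dirnames, filenames in os.walk(top, topdown=True):
            if len(dirnames) > 1:
                # Sorting dirnames in place changes the traversal order,
                # because os.walk descends into dirnames as it stands here.
                dirnames.sort(
                    key=lambda d: os.stat(os.path.join(dirpath, d)).st_ino)
            yield dirpath, dirnames, filenames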

For my use case, it sped up my complete directory walk from 18 minutes down to 13 minutes...

