How to read the first N files from a directory (please NOT a "head -n" solution)?
Problem description
I have a directory with more than 60000 files. How can I get only N of them without using a find | head -n or ls | head -n solution, since find and ls take too much time to read this list of files? Are there any configuration options for ls and find, or any other programs, that can help save time?
Recommended answer
For what it's worth:
# Create about 60000 files (names 00000 through 60100; the two ranges overlap slightly)
sh$ for i in {0..100}; do
    for j in {0..600}; do
        touch "$(printf "%05d" $((i + j*100)))";
    done;
done
On Linux Debian Wheezy x86_64 with an ext4 file system:
sh$ time bash -c 'ls | head -n 50000 | tail -10'
49990
49991
49992
49993
49994
49995
49996
49997
49998
49999
real 0m0.248s
user 0m0.212s
sys 0m0.024s
sh$ time bash -c 'ls -f | head -n 50000 | tail -10'
27235
02491
55530
44435
24255
47247
16033
45447
18434
35303
real 0m0.051s
user 0m0.016s
sys 0m0.028s
sh$ time bash -c 'find | head -n 50000 | tail -10'
./02491
./55530
./44435
./24255
./47247
./16033
./45447
./18434
./35303
./07658
real 0m0.051s
user 0m0.024s
sys 0m0.024s
sh$ time bash -c 'ls -f | sed -n 49990,50000p'
30950
27235
02491
55530
44435
24255
47247
16033
45447
18434
35303
real 0m0.046s
user 0m0.032s
sys 0m0.016s
Of course, the following two are faster, as they only take the first entries (and they interrupt the process on the other end of the pipe with a broken pipe once the required "lines" have been read):
sh$ time bash -c 'ls -f | sed 1000q >/dev/null'
real 0m0.008s
user 0m0.004s
sys 0m0.000s
sh$ time bash -c 'ls -f | head -n 1000 >/dev/null'
real 0m0.008s
user 0m0.000s
sys 0m0.004s
Interestingly enough (?), with sed the time is spent in user space, whereas with head it is spent in sys. After several runs, the results are consistent.
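The early-exit pipelines timed above can be wrapped in a small helper. What follows is only a minimal sketch, not part of the original answer: first_n is a hypothetical name, and it assumes an ls whose -f option lists entries in directory order without sorting and implies -a (as in GNU coreutils and POSIX), so "." and ".." have to be filtered out. File names containing newlines are not handled.

```shell
# first_n DIR N: print up to N entries of DIR in directory order (unsorted).
# Hypothetical helper, shown for illustration only.
first_n() {
    dir=$1
    n=$2
    # ls -f skips sorting; it also turns on -a, so "." and ".." show up
    # and are dropped by grep (-x: whole line, -F: fixed strings).
    # head exits after N lines, which ends the upstream processes
    # through a broken pipe instead of letting them list everything.
    ls -f -- "$dir" | grep -v -x -F -e . -e .. | head -n "$n"
}

# Usage (prints 10 entries of the current directory):
# first_n . 10
```

The order is whatever the file system returns, so the result is "some N files", not the first N in alphabetical order. Getting sorted output would require reading the whole directory first, which is the cost the question is trying to avoid.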