Performance problems with list.files

Problem description

I am trying to retrieve files from 3 network drives using list.files, and it takes forever. When I use find in the shell, it returns all results in less than 15 seconds.

system.time(
  jnk <- list.files(c("/Volumes/massspec", "/Volumes/massspec2", "/Volumes/massspec3"), 
                    pattern='_MA_.*_HeLa_', 
                    recursive=TRUE))
#   user  system elapsed 
#  1.567   6.381 309.500 

Here is the equivalent shell command.

time find /Volumes/masssp* -name '*_MA_*_HeLa_*'
# real  0m13.776s
# user  0m0.361s
# sys   0m0.620s

I need a solution that works on both Windows and Unix systems. Does anyone have a good idea? The network drives hold about 120,000 files in total, amounting to about 16 TB. So there are not many files, but they are very large.

Solution

Based on the comment, I wrote a little R function which should work on Windows and Unix...

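quickFileSearch <- function(path, pattern) {
  # 'pattern' is a shell wildcard such as '*_MA_*_HeLa_*', not a regular expression.
  switch(.Platform$OS.type,
         unix = {
           # Quote every argument so the shell neither expands the wildcard
           # nor splits paths that contain spaces.
           paths <- paste(shQuote(path), collapse = ' ')
           command <- paste('find', paths, '-name', shQuote(pattern))
           system(command, intern = TRUE)
         },
         windows = {
           # dir takes path\pattern; /b = bare names, /s = recurse, /a-d = skip directories.
           paths <- paste(shQuote(file.path(path, pattern, fsep = '\\'), type = 'cmd'),
                          collapse = ' ')
           command <- paste('dir', paths, '/b /s /a-d')
           shell(command, intern = TRUE)
         }
  )
}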

It has not been tested much yet, but it works for my purpose.
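
One thing to watch when switching from list.files to a find- or dir-based search is that the pattern changes form: list.files expects a regular expression, while find -name and dir expect a shell wildcard. The sketch below is only an illustration on a throwaway local directory. The directory name and file names are made up for the example. It uses utils::glob2rx to turn the wildcard into a regular expression and checks which files that expression selects.

```r
# Build a small temporary directory tree (hypothetical file names).
root <- file.path(tempdir(), "massspec_demo")
dir.create(file.path(root, "run1"), recursive = TRUE, showWarnings = FALSE)
demo_files <- c("a_MA_01_HeLa_x.raw", "b_MA_02_HeLa_y.raw", "c_other.raw")
invisible(file.create(file.path(root, "run1", demo_files)))

glob  <- "*_MA_*_HeLa_*"        # wildcard form, as passed to find or dir
regex <- utils::glob2rx(glob)   # regular-expression form, as expected by list.files

# list.files applies the regular expression to file names, not to full paths.
hits <- list.files(root, pattern = regex, recursive = TRUE)
print(basename(hits))
```

Running the same wildcard through the function above and comparing the result with hits would be a cheap sanity check before pointing it at the network drives.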
