Performance problems with list.files
I am trying to retrieve files from 3 network drives using list.files, and it takes forever. When I use find in the shell, it returns all results in less than 15 seconds.
system.time(
  jnk <- list.files(c("/Volumes/massspec", "/Volumes/massspec2", "/Volumes/massspec3"),
                    pattern = '_MA_.*_HeLa_',
                    recursive = TRUE))
# user system elapsed
# 1.567 6.381 309.500
Here is the equivalent shell command.
time find /Volumes/masssp* -name '*_MA_*_HeLa_*'
# real 0m13.776s
# user 0m0.361s
# sys 0m0.620s
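Note that list.files takes a regular expression while find -name takes a shell wildcard, so the two patterns above are written differently. The sketch below is one way to check that they select the same file names. It runs in a temporary directory with made-up file names (both are assumptions for illustration, not the real network drives) and uses utils::glob2rx to turn the wildcard into a regular expression.

```r
# Hypothetical sandbox: a temporary directory with invented file names
root <- file.path(tempdir(), "massspec_demo")
dir.create(file.path(root, "sub"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, c("a_MA_01_HeLa_x.raw",
                              "sub/b_MA_02_HeLa_y.raw",
                              "sub/other.raw")))

# Convert the wildcard used with find into a regular expression
rx <- utils::glob2rx("*_MA_*_HeLa_*")

# list.files matches the pattern against file names, not full paths
byGlob  <- list.files(root, pattern = rx, recursive = TRUE)
byRegex <- list.files(root, pattern = "_MA_.*_HeLa_", recursive = TRUE)

# Both patterns should pick the same two files and skip 'other.raw'
print(byGlob)
print(identical(byGlob, byRegex))
```

This only compares which names are matched; it says nothing about the speed difference on the network drives.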
I need a solution that works on both Windows and Unix systems. Does anyone have a good idea? The network drives hold about 120,000 files in total, amounting to about 16 TB: not many files, but very large ones.
Based on the comment, I wrote a small R function that should work on Windows and Unix:
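quickFileSearch <- function(path, pattern) {
  # 'pattern' is a shell wildcard (e.g. '*_MA_*_HeLa_*'), not a regular expression
  switch(.Platform$OS.type,
    unix = {
      # quote every argument so the shell does not expand the wildcard itself
      paths <- paste(shQuote(path), collapse = ' ')
      command <- paste('find', paths, '-name', shQuote(pattern))
      system(command, intern = TRUE)
    },
    windows = {
      # dir expects <directory>\<wildcard>; quoting protects paths with spaces
      paths <- paste(shQuote(file.path(path, pattern, fsep = '\\'), type = 'cmd'),
                     collapse = ' ')
      # /b bare names, /s recurse into subdirectories, /a-d exclude directories
      command <- paste('dir', paths, '/b /s /a-d')
      shell(command, intern = TRUE)
    }
  )
}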
The whole thing has not been tested much yet, but it works for my purpose.