Retrieve modified DateTime of a file from an FTP Server


Question

Is there a way to find the modified date/time for files on an FTP server in R? I have found a good way to list all of the available files, but I only want to download the ones that have been updated since my last check. I tried using:

info<-file.info(url)

However, it returns a pretty ugly list of nothing. My URL is made up of: "ftp://username:password@FTPServer//filepath.xml"

Recommended answer

Until we see the directory-listing output from this particular FTP server (they are all different), here's a path you can follow:

library(curl)
library(stringr)

Get the raw directory listing:

con <- curl("ftp://ftp.FreeBSD.org/pub/FreeBSD/")
dat <- readLines(con)
close(con)
dat

## [1] "-rw-rw-r--    1 ftp      ftp          4259 May 07 16:18 README.TXT" 
## [2] "-rw-rw-r--    1 ftp      ftp            35 Sep 09 21:00 TIMESTAMP"  
## [3] "drwxrwxr-x    9 ftp      ftp            11 Sep 09 21:00 development"
## [4] "-rw-r--r--    1 ftp      ftp          2566 Sep 09 10:00 dir.sizes"  
## [5] "drwxrwxr-x   28 ftp      ftp            52 Aug 23 10:44 doc"        
## [6] "drwxrwxr-x    5 ftp      ftp             5 Aug 05 04:16 ports"      
## [7] "drwxrwxr-x   10 ftp      ftp            12 Sep 09 21:00 releases"   

Filter out the directories:

no_dirs <- grep("^d", dat, value=TRUE, invert=TRUE)
no_dirs

## [1] "-rw-rw-r--    1 ftp      ftp          4259 May 07 16:18 README.TXT"
## [2] "-rw-rw-r--    1 ftp      ftp            35 Sep 09 21:00 TIMESTAMP" 
## [3] "-rw-r--r--    1 ftp      ftp          2566 Sep 09 10:00 dir.sizes" 

Extract just the timestamp and filename:

date_and_name <- sub("^[[:alnum:][:punct:][:blank:]]{43}", "", no_dirs)
date_and_name
## [1] "May 07 16:18 README.TXT"
## [2] "Sep 09 21:00 TIMESTAMP" 
## [3] "Sep 09 10:00 dir.sizes" 

Put them into a data.frame:

do.call(rbind.data.frame, 
        lapply(str_match_all(date_and_name, "([[:alnum:] :]{12}) (.*)$"), 
               function(x) {
                 data.frame(timestamp=x[2],
                            filename=x[3], 
                            stringsAsFactors=FALSE)
})) -> dat
dat

##      timestamp   filename
## 1 May 07 16:18 README.TXT
## 2 Sep 09 21:00  TIMESTAMP
## 3 Sep 09 10:00  dir.sizes

You still need to convert the timestamp to a POSIXct, but that's trivial.
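A minimal sketch of that conversion, applied to the question's goal of picking out files changed since the last check. The listing carries no year, so `assumed_year` is an assumption you have to supply, as are the UTC time zone and the `last_check` value (both hypothetical). Note that `%b` matches month abbreviations in the current locale, so this assumes an English locale.

```r
# Timestamps as extracted from the listing above (no year is present).
dat <- data.frame(
  timestamp = c("May 07 16:18", "Sep 09 21:00", "Sep 09 10:00"),
  filename  = c("README.TXT", "TIMESTAMP", "dir.sizes"),
  stringsAsFactors = FALSE
)

# Assumption: every entry belongs to this year and the server reports UTC.
assumed_year <- 2015

# Prepend the year, then parse "YYYY Mon DD HH:MM" into POSIXct.
dat$modified <- as.POSIXct(
  paste(assumed_year, dat$timestamp),
  format = "%Y %b %d %H:%M",
  tz = "UTC"
)

# Hypothetical time of the previous check; keep only newer files.
last_check <- as.POSIXct("2015-09-01 00:00:00", tz = "UTC")
to_download <- dat$filename[dat$modified > last_check]
to_download
```

With these assumed values, only the two entries dated Sep 09 are selected. In practice you would store the time of each run and use it as `last_check` on the next one.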

This particular example depends on that system's FTP directory listing format. Just change the regexes to match the output of your server.
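If your server's listing is also in the Unix `ls -l` style but the column widths differ, the fixed 43-character offset above will not line up. One possible variation, sketched here with base R only, splits on whitespace-separated fields instead. The `pattern` and the column names are illustrative assumptions, and it will not fit servers that return a different listing style.

```r
# Sample lines in the Unix "ls -l" style shown above.
listing <- c(
  "-rw-rw-r--    1 ftp      ftp          4259 May 07 16:18 README.TXT",
  "drwxrwxr-x    9 ftp      ftp            11 Sep 09 21:00 development",
  "-rw-r--r--    1 ftp      ftp          2566 Sep 09 10:00 dir.sizes"
)

# Fields: permissions, link count, owner, group, size, then
# month, day, a third date field, and finally the file name.
pattern <- paste0(
  "^(\\S+)\\s+\\d+\\s+\\S+\\s+\\S+\\s+(\\d+)\\s+",
  "(\\S+\\s+\\d+\\s+\\S+)\\s+(.*)$"
)

# regexec() returns the full match followed by each capture group.
parts <- regmatches(listing, regexec(pattern, listing))

parsed <- do.call(rbind, lapply(parts, function(m) {
  data.frame(
    is_dir    = startsWith(m[2], "d"),  # first permission character
    size      = as.numeric(m[3]),
    timestamp = m[4],
    filename  = m[5],
    stringsAsFactors = FALSE
  )
}))

# Drop directories, keeping regular files only.
files <- parsed[!parsed$is_dir, ]
files
```

Because the file name is the last capture group, names containing spaces are kept whole. The third date field is matched loosely because listings in this style commonly show a year in place of the time for older files, which would then need its own parsing branch.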
