Listing all files matching a full-path pattern in R


Problem description

I am trying to obtain the list of files matching a full-path pattern. So far I have tried list.files(), but it did not work.

Let's assume that we have the following directory layout:

results
   |- A
   |  |- data-1.csv
   |  |- data-2.csv
   |
   |- B
      |- data-1.csv
      |- data-2.csv

Then the following command:

list.files(pattern='data-.*\.csv', recursive=TRUE)

will return all the files matching the pattern. This works, but the problem appears when using a full-path pattern. For instance, if I want to obtain all the CSV files from directory results/A, I could try:

list.files(pattern='results/A/data-.*\.csv', recursive=TRUE)

This does not work, though. It seems that R does not match the regular expression against the full path. In this case, the solution could be to just use results/A as the base path. But in more complex problems, that cannot be done. For instance, at some point we may want to match only the subdirectories whose names consist solely of letters:

list.files(pattern='results/[A-Z]+/data-.*\.csv', recursive=TRUE)

Is it possible to do this in R?

UPDATE: After using ad hoc solutions for a while, I decided to stop typing the same thing again and again. So, I created a library for simplifying this task.

Recommended answer

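First, note that your patterns are not written as valid regular expressions for R. Inside an R string literal the backslash itself must be escaped, so a literal dot is written `\\.` (a single `\.` is an unrecognized escape and will not parse). Your first example should be:

list.files(pattern='data-.*\\.csv', recursive=TRUE)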

Then, list.files applies the pattern to the file basenames only (that is, without the directory path), so you could split the task into two steps:

  1. Find all files whose basename matches, and return their full paths:

basename.matches <- list.files(pattern='data-.*\\.csv', recursive=TRUE,
                               full.names = TRUE)
basename.matches
# [1] "./results/A/data-1.csv" "./results/A/data-2.csv" "./results/B/data-1.csv"
# [4] "./results/B/data-2.csv"

  2. Keep only those whose path matches the expected directory (or directories):

    full.matches <- grep(pattern='^\\./results/A/', basename.matches, value = TRUE)
    full.matches
    # [1] "./results/A/data-1.csv" "./results/A/data-2.csv"
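
The two steps above can be folded into one small helper, which also covers the `results/[A-Z]+/...` case from the question. The following is only a sketch: `list_files_by_path` and its `root` argument are hypothetical names, not part of base R, and the example builds the directory layout in a temporary directory so that it can be run anywhere. It lists paths relative to `root` (so there is no leading `./`) and filters them with `grep`.

```r
# Hypothetical helper (not part of base R): return the files under `root`
# whose path relative to `root` matches the regular expression `path_pattern`.
list_files_by_path <- function(path_pattern, root = ".") {
  # Step 1: collect every file below `root`; with recursive = TRUE and the
  # default full.names = FALSE the paths are relative to `root`.
  all_files <- list.files(path = root, recursive = TRUE)
  # Step 2: filter on the whole relative path, not only the basename.
  grep(path_pattern, all_files, value = TRUE)
}

# Recreate the example layout from the question in a temporary directory.
root <- tempfile("demo")
for (d in c("A", "B")) {
  dir.create(file.path(root, "results", d), recursive = TRUE)
  file.create(file.path(root, "results", d, c("data-1.csv", "data-2.csv")))
}

# Match CSV files in any subdirectory of results/ named with capital letters.
matches <- list_files_by_path("^results/[A-Z]+/data-.*\\.csv$", root = root)
print(matches)

# Restrict the match to results/A only.
only_a <- list_files_by_path("^results/A/data-.*\\.csv$", root = root)
print(only_a)
```

Because the filter runs after the listing, every file under `root` is enumerated first; for very large trees it may be cheaper to pass a narrower `root`.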