Select files in directory and move them based on text list of filenames

Problem description

So I have a folder of a few thousand pdf files in /path, and I have a list of hundreds of names called names.csv (only one column, it could just as easily be .txt).

I'm trying to select (and ideally, move) the pdfs, where any name from names.csv is found in any filename.

From my research so far, it seems like listdir and regex is one approach to at least get a list of the files I want:

import os, sys  
import re 


for files in os.listdir('path'):
    with open('names.csv') as names: 
        for name in names:
            match  = re.search(name, files)

        print match  

But currently this is just returning 'None' 'None' etc, all the way down.

I'm probably doing a bunch of things wrong here. And I'm not even near the part where I need to move the files. But I'm just hoping to get over this first hump.

Any advice is much appreciated!

Recommended answer

The problem is that your name variable always ends with a newline character \n. The newline character isn't present in the file names, so regex doesn't find any matches.
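The effect is easy to reproduce in isolation. The sketch below is in Python 3 and uses io.StringIO as a stand-in for names.csv. The names and the filename are made up for illustration.

```python
import io
import re

# Stand-in for names.csv: iterating over a file object yields each line
# with its trailing newline still attached.
names_file = io.StringIO("smith\njones\n")
lines = list(names_file)

filename = "report_smith_2020.pdf"  # hypothetical filename

# The raw line is "smith\n", so the pattern also requires a newline.
raw_match = re.search(lines[0], filename)

# After .strip() the pattern is just "smith".
stripped_match = re.search(lines[0].strip(), filename)

print(repr(lines[0]))              # shows the trailing newline
print(raw_match)                   # no match object
print(stripped_match is not None)  # the stripped name is found
```

Printing repr(name) inside the original loop is a quick way to spot this kind of invisible trailing character.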

There are also a few other small issues with your code:


  • You're opening the names.csv file in each iteration of the loop. It would be more efficient to open the file once, read the names into a list, and then loop through all files in the directory.
  • Regex isn't necessary here, and in fact can cause problems. If, for example, a line in your csv file looked like "(this isn't a valid regex", your code would throw an exception. This could be fixed by escaping the name first (re.escape does this), but regex still isn't necessary.
  • Your print match is in the wrong place. Since match is overwritten in each iteration of the inner loop and you print its value only after that loop, you only ever see its last value.
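To illustrate the regex point, here is a minimal Python 3 sketch. The name "(draft" and the filename are hypothetical. It shows the raw name raising an error as a pattern, while both the escaped pattern and the plain "in" test find it.

```python
import re

name = "(draft"                 # hypothetical name with a regex metacharacter
filename = "notes (draft).pdf"  # hypothetical filename

# Used directly as a pattern, the unbalanced parenthesis is a syntax error.
try:
    re.search(name, filename)
    raised = False
except re.error:
    raised = True

# re.escape turns the name into a pattern that matches it literally.
escaped_hit = re.search(re.escape(name), filename) is not None

# A plain substring test needs no escaping at all.
plain_hit = name in filename

print(raised, escaped_hit, plain_hit)
```

Since the goal is only "does this name occur in the filename", the substring test is the simpler choice.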

Here is the fixed code:

import os

# open the file, build a list of all names, close the file
with open('names.csv') as names_file:
    # use .strip() to remove trailing whitespace and line breaks;
    # skip blank lines, since an empty string is "in" every filename
    names = [line.strip() for line in names_file if line.strip()]

for filename in os.listdir('path'):
    for name in names:
        # no need for re.search, just use the "in" operator
        if name in filename:
            # move the file, keeping its name in the destination directory
            os.rename(os.path.join('path', filename),
                      os.path.join('/path/to/somewhere/else', filename))
            break