Find a file in a directory using Python by partial name


Question

I have a directory with several hundred thousand files in it.

They all follow this format:

datetime_fileid_metadata_collect.txt

A specific example:

201405052359559_0002230255_35702088_collect88.txt

I am trying to write a script that pulls out and copies individual files when all I provide is a list of file IDs.

For example, I have a text document, fileids.txt, that contains this:

fileids.txt
0002230255
0001627237
0001023000

This is the example script I have written so far. The file1 result keeps returning [].

import os
import re, glob, shutil
base_dir = 'c:/stuff/tub_0_data/'
destination = 'c:/files_goes_here'
os.chdir(base_dir)
text_file = open('c:/stuff/fileids.txt', 'r')
file_ids = text_file.readlines()
#file_ids = [stripped for stripped in (line.strip() for line in text_file.readlines()) if stripped]
for ids in file_ids:
    id1 = ids.rstrip()
    print 'file id = ',str(id1)
    file1 = glob.glob('*' + str(id1) + '*')
    print str(file1)
    if file1 != []:
        shutil.copy(base_dir + file1, destination)

I know I don't fully understand glob or regular expressions yet. What would I put there if I want to find files based on a specific string in their filename?

glob.glob('*' + stuff + '*')

worked for finding things within the filename. Not stripping the trailing whitespace (the newline) from each line was the issue.
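As a small illustration of why the unstripped line broke the match: glob applies the same wildcard rules as the standard fnmatch module, so the effect can be reproduced without touching the file system. This is only a sketch; the variable names are chosen for illustration and the filename is the example from the question.

```python
import fnmatch

filename = "201405052359559_0002230255_35702088_collect88.txt"
raw_line = "0002230255\n"  # one line as returned by readlines()

# The pattern built from the raw line still contains the newline,
# which the filename does not, so nothing matches.
matches_raw = fnmatch.fnmatch(filename, "*" + raw_line + "*")

# The pattern built from the stripped ID matches as intended.
matches_stripped = fnmatch.fnmatch(filename, "*" + raw_line.strip() + "*")

print(matches_raw, matches_stripped)
```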

Recommended answer

text_file.readlines() returns each line in full, including the trailing '\n'. Try stripping it. The following will strip newlines and skip empty lines:

file_ids = [line.strip() for line in text_file if not line.isspace()]
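Putting the pieces together, a minimal Python 3 sketch of the whole task might look like the following. The function name copy_files_by_id and its parameters are hypothetical, not from the original post. Note that glob.glob returns a list of paths, so each match is copied individually rather than being concatenated onto base_dir.

```python
import glob
import os
import shutil


def copy_files_by_id(ids_path, base_dir, destination):
    """Copy every file in base_dir whose name contains an ID listed in ids_path."""
    with open(ids_path) as text_file:
        # Strip newlines and skip blank lines, as in the answer.
        file_ids = [line.strip() for line in text_file if not line.isspace()]

    os.makedirs(destination, exist_ok=True)
    copied = []
    for file_id in file_ids:
        # glob.escape keeps wildcard characters in the directory or ID literal.
        pattern = os.path.join(glob.escape(base_dir),
                               "*" + glob.escape(file_id) + "*")
        # glob.glob returns a list, so copy each matching path in turn.
        for path in glob.glob(pattern):
            shutil.copy(path, destination)
            copied.append(os.path.basename(path))
    return copied
```

With the paths from the question, this could be called as copy_files_by_id('c:/stuff/fileids.txt', 'c:/stuff/tub_0_data/', 'c:/files_goes_here'). Given the datetime_fileid_metadata_collect.txt format, a pattern such as '*_' + file_id + '_*' would be stricter, since a bare ID could also appear inside another field. Each glob call also rescans the directory, so for several hundred thousand files it may be worth listing the directory once and indexing the names by ID; that trade-off is not measured here.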
