Python read text file based on partial name and file timestamp

Question

I'm trying to load two versions of the same file into Python as separate dataframes, with the end goal of comparing what was added in the new file and what was removed from the old one. So far, I've got code that looks like this:

import os
import pandas as pd

path = r'\\Documents\FileList'
files = os.listdir(path)

# keep only the .txt files
files_txt = [f for f in files if f[-3:] == 'txt']

for f in files_txt:
    data = pd.read_excel(path + r'\\' + f)
    df = df.append(data)

I've also set a variable equal to the current date minus a certain number of days, which I want to use to pull the file whose date matches that variable:

d7 = dt.datetime.today() - timedelta(7)

Right now I'm unsure how to do this: the first part of the filename always stays the same, but a date is appended at the end (e.g. file_03232016, then file_03302016). I want to search the directory for files whose names start with that common prefix and add a file to a dataframe if it matches the date parameter I set.

I forgot to add that sometimes I also need to look at the file's creation timestamp on the filesystem, as the date isn't always present in the file name.

Recommended answer

Here are some modifications to your original code to get a list of files containing your target date. Use strftime to format the target date the same way it appears in the file names.

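import os
import datetime as dt
from datetime import timedelta

import pandas as pd

# path and files come from the question's code:
# path = r'\\Documents\FileList'; files = os.listdir(path)

d7 = dt.datetime.today() - timedelta(7)
# format the date as it appears in the file names, e.g. '_03232016'
target_date_str = d7.strftime('_%m%d%Y')

# keep only files ending in the target date plus '.txt', e.g. '_03232016.txt'
files_txt = [f for f in files if f.endswith(target_date_str + '.txt')]

data = []
for f in files_txt:
    # read_excel is kept from the question; plain-text files may need pd.read_csv instead
    data.append(pd.read_excel(os.path.join(path, f)))
df = pd.concat(data, ignore_index=True)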
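
The code above only matches files whose names carry the date. For the case mentioned in the question, where the date is sometimes missing from the name, one possible approach is to fall back to the filesystem timestamp. The following is a minimal sketch, not part of the original answer: the names DATE_IN_NAME, file_date and files_for_date are hypothetical, and it assumes the date, when present, appears as an underscore followed by MMDDYYYY at the end of the name. Note that os.path.getctime returns the creation time on Windows but the last metadata change time on Unix-like systems.

```python
import datetime as dt
import os
import re

# Assumed naming pattern: "_" + MMDDYYYY right before the extension (or end of name)
DATE_IN_NAME = re.compile(r'_(\d{8})(?=\.|$)')


def file_date(file_path):
    """Return the date in the file name if present, else the filesystem timestamp date."""
    match = DATE_IN_NAME.search(os.path.basename(file_path))
    if match:
        try:
            return dt.datetime.strptime(match.group(1), '%m%d%Y').date()
        except ValueError:
            pass  # eight digits that are not a valid date: fall back below
    # Creation time on Windows; last metadata change time on Unix-like systems
    return dt.date.fromtimestamp(os.path.getctime(file_path))


def files_for_date(directory, target_date, extension='.txt'):
    """List files in directory with the given extension whose date equals target_date."""
    return sorted(
        name for name in os.listdir(directory)
        if name.endswith(extension)
        and file_date(os.path.join(directory, name)) == target_date
    )
```

With the variables from the answer, `files_for_date(path, d7.date())` would give the list of matching file names, which can then be read and concatenated as before.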
