Capture all csv files within all subfolders in main directory - Python 3.x


Problem description

The code below splits csv files based on a given time value. The problem is that it does not capture all the csv files. For example, the TT1 folder contains several subfolders, and those subfolders contain folders of their own. The csv files are inside those sub-subfolders. When I set path = '/root/Desktop/TT1', the files inside those sub-subfolders are not processed. How can I fix this?

After @Serafeim's answer (https://stackoverflow.com/a/57110519/5025009), I tried this:

import pandas as pd
import numpy as np
import glob
import os

path = '/root/Desktop/TT1/'
mystep = 0.4

#define the function
def data_splitter(df, name):
    max_time = df['Time'].max() # get max value of Time for the current csv file (df)
    myrange= np.arange(0, max_time, mystep) # build the threshold range
    for k in range(len(myrange)):
        # build the upper values 
        temp = df[(df['Time'] >= myrange[k]) & (df['Time'] < myrange[k] + mystep)]
        temp.to_csv("/root/Desktop/T1/{}_{}.csv".format(name, k))

for filename in glob.glob(os.path.join(path, '*.csv')):
    df = pd.read_csv(filename)
    name = os.path.split(filename)[1] # get the name of the file
    data_splitter(df, name)

Recommended answer

You can get all the subfolders automatically with os.walk and build the path for each one:

import pandas as pd
import numpy as np
import glob
import os

path = '/root/Desktop/TT1/'
mystep = 0.4

#define the function
def data_splitter(df, name):
    max_time = df['Time'].max() # get max value of Time for the current csv file (df)
    myrange= np.arange(0, max_time, mystep) # build the threshold range
    for k in range(len(myrange)):
        # keep the rows whose Time falls in [myrange[k], myrange[k] + mystep)
        temp = df[(df['Time'] >= myrange[k]) & (df['Time'] < myrange[k] + mystep)]
        temp.to_csv("/root/Desktop/T1/{}_{}.csv".format(name, k))

# os.walk(path) yields path itself and then every nested subfolder as root,
# so globbing in root covers csv files at every depth (including path itself)
for root, _, _ in os.walk(path):
    for filename in glob.glob(os.path.join(root, '*.csv')):
        df = pd.read_csv(filename)
        name = os.path.split(filename)[1] # get the name of the current csv file
        data_splitter(df, name)
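
As an alternative to os.walk, the standard library's pathlib can do the recursive search in one call. The sketch below is only an illustration, not part of the original answer: the helper names find_csv_files and output_stem are hypothetical. It also addresses a side effect of writing every piece into one flat output folder: csv files with the same name in different subfolders would overwrite each other, so the subfolder names are folded into the output name.

```python
from pathlib import Path


def find_csv_files(main_dir):
    """Return every .csv file under main_dir, at any depth, in sorted order."""
    # rglob('*.csv') searches main_dir and all nested subfolders
    return sorted(p for p in Path(main_dir).rglob('*.csv') if p.is_file())


def output_stem(csv_path, main_dir):
    """Build a flat name from the path relative to main_dir (hypothetical helper).

    e.g. <main_dir>/Sub1/a/data.csv -> 'Sub1_a_data'
    """
    rel = Path(csv_path).relative_to(main_dir).with_suffix('')
    return '_'.join(rel.parts)
```

Each path returned by find_csv_files(path) can be passed to pd.read_csv, and output_stem(csv_path, path) can be given to data_splitter as name. Joining with underscores makes collisions much less likely, though two different folder layouts could still produce the same name in unusual cases.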
