在python中导入多个excel文件,进行操作,然后在同一目录中导出多个文件 [英] Import multiple excel files in python, manipulate, and then export multiple files in same directory

查看:51
本文介绍了在python中导入多个excel文件,进行操作,然后在同一目录中导出多个文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在同一文件夹中的50个不同的excel文件中有50个人的一些数据.对于每个人,数据都存在五个不同的文件中,如下所示:

I have some data of 50 people in 50 different excel files placed in the same folder. For each person the data is present in five different files like shown below:

示例:Person1_a.xls,Person1_b.xls,Person1_c.xls,Person1_d.xls,Person1_e.xls.

Example: Person1_a.xls, Person1_b.xls, Person1_c.xls, Person1_d.xls, Person1_e.xls.

每个excel工作表都有两列和多个工作表.我需要创建一个文件Person1.xls,它将所有这些文件的第二列组合在一起.相同的过程应适用于所有50个人.

Each excel sheet has two columns and multiple sheets. I need to create a file Person1.xls which will have the second column of all these files, combined. Same process should be applicable for all the 50 people.

任何建议将不胜感激.

谢谢!

推荐答案

我创建了一个与您的文件夹相似的试用文件夹.我只添加了Person1和Person3的数据.

I have created a trial folder that I believe is similar to yours. I added data only for Person1 and Person3.

在所附图片中,名为 Person1 Person3 的文件是导出的文件,每个人仅包含第二列.所以每个人现在都有自己的文件.

In the attached picture, the files called Person1 and Person3 are the exported files that include only the 2nd column for each person. So each person has their own file now.

我在每行的内容上做了一个简短的描述.如果有不清楚的地方请告诉我.

I added a small description on what each line does. Please let me know if something is not clear.

import pandas as pd
import glob

path = r'C:\..\trial' # use your path where the files are
all_files = glob.glob(path + "/*.xlsx") # will get you all files with an extension .xlsx in a folder

li = []
for i in range(0,51): # numbers from 1 to 50 (for the 50 different people)
    for f in all_files:
        if str(i) in f: # checks if the number (i) is in the excel name
            df = pd.read_excel(f,
                                 sheet_name=0, # import 1st sheet
                                 usecols=([1])) # only import column 2
            df['person'] = f.rsplit('\\',1)[1].split('_')[0] # get the name of the person in a column
            li.append(df) # add it to the list of dataframes

all_person = pd.concat(li, axis=0, ignore_index=True)  # concat all dataframes imported      

然后,您可以导出到同一路径,为每个不同的人导出一个不同的excel文件

Then you can export to the same path, a different excel file for each different person

for i,j in all_person.groupby('person'):
    j.to_excel(f'{path}\{i}.xlsx', index = False)

我知道这可能不是最有效的方法,但是它可能会为您提供所需的东西.

I am aware that this is probably not the most efficient way, but it will probably get you what you need.

这篇关于在python中导入多个excel文件,进行操作,然后在同一目录中导出多个文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆