Python Loop through Excel sheets, place into one df

Views: 155  Published: 2020/5/23 22:20:50  Tags: python excel pandas


Question

I have an Excel file foo.xlsx with about 40 sheets named sh1, sh2, etc. Each sheet has the format:

area      cnt   name\nparty1   name\nparty2
blah      9         5               5
word      3         7               5

In each sheet I want to rename the variables of the form name\nparty so that only the party remains as the label. Example output:

area      cnt    party1    party2     sheet
bacon     9         5         5        sh1
spam      3         7         5        sh1
eggs      2         18        4        sh2

I am reading in the file with:

book = pd.ExcelFile(path)

And then I am wondering if I need to do:

for f in filelist:
    df = pd.ExcelFile.parse(book, sheetname=??)
    'more operations here'
    # only change column names 2 and 3
    for i, col in enumerate(df):
        if i >= 2 and i <= 3:
            new_col_name = col.split("\n")[-1]
            df[new_col_name] =

Or something like that?

Recommended answer

Update as of 2019-09-09:

Use sheet_name instead of sheetname for pandas v0.25.1.

The pandas read_excel function lets you read all sheets at once if you pass the keyword argument sheet_name=None (spelled sheetname in the older pandas versions this answer was originally written for). It returns a dictionary: the keys are the sheet names, and the values are the sheets as DataFrames.
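To make the shape of that return value concrete, here is a small sketch. It builds the dictionary by hand instead of reading a file, so the sheet names and cell values are stand-ins chosen for illustration, and sheet_shapes is a hypothetical helper. With a real workbook, the same kind of dictionary would come from pd.read_excel(path, sheet_name=None).

```python
import pandas as pd

# Hand-built stand-in (an assumption for illustration) for the result of
# pd.read_excel(path, sheet_name=None):
# keys are sheet names, values are the sheets as DataFrames.
sheets_dict = {
    "sh1": pd.DataFrame({"area": ["bacon", "spam"], "cnt": [9, 3]}),
    "sh2": pd.DataFrame({"area": ["eggs"], "cnt": [2]}),
}


def sheet_shapes(sheets):
    """Map each sheet name to its (rows, columns) shape."""
    return {name: df.shape for name, df in sheets.items()}


print(sheet_shapes(sheets_dict))
```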

Using this, we can simply loop through the dictionary and, for each sheet:

Add an extra column to the DataFrame containing the relevant sheet name
Use the rename method to rename the columns: the lambda splits each column name on newlines and keeps the final piece, so a name without a newline is left unchanged
Append the sheet to the "full table"
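The renaming step can be tried on its own. The sketch below uses the column names from the question's example; party_only is a hypothetical helper name that does the same thing as the lambda, written out as a function so it can be called directly.

```python
import pandas as pd


def party_only(column_name):
    """Return the text after the last newline, or the name unchanged."""
    return column_name.split("\n")[-1]


# One row shaped like the question's sheets (values are illustrative).
sheet = pd.DataFrame(
    {"area": ["bacon"], "cnt": [9], "name\nparty1": [5], "name\nparty2": [5]}
)

renamed = sheet.rename(columns=party_only)
print(list(renamed.columns))
```

Because rename accepts any callable, the same function is applied to every column, so area and cnt pass through untouched without an index check on the column positions.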

Once this is done, we reset the index and all should be well. Note: if some parties appear on one sheet but not on others, this still works; each sheet's missing party columns are filled with NaN.
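The NaN behaviour in the note can be seen with two small hand-made frames. This is a sketch using pd.concat, with values that are assumptions for illustration; the second frame deliberately has no party2 column.

```python
import pandas as pd

sh1 = pd.DataFrame(
    {"area": ["bacon"], "party1": [5], "party2": [5], "sheet": ["sh1"]}
)
# This sheet has no party2 column.
sh2 = pd.DataFrame({"area": ["eggs"], "party1": [18], "sheet": ["sh2"]})

# Stacking keeps the union of the columns; cells a sheet lacks become NaN.
combined = pd.concat([sh1, sh2], ignore_index=True)
print(combined)
```

The complete loop from the answer is: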

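import pandas as pd

# sheet_name=None returns a dict: {sheet name: DataFrame}
sheets_dict = pd.read_excel('Book1.xlsx', sheet_name=None)

# DataFrame.append was removed in pandas 2.0, so collect and concatenate
frames = []
for name, sheet in sheets_dict.items():
    sheet['sheet'] = name  # record the source sheet
    # keep only the text after the last newline in each column name
    sheet = sheet.rename(columns=lambda x: x.split('\n')[-1])
    frames.append(sheet)
full_table = pd.concat(frames)

full_table.reset_index(inplace=True, drop=True)

print(full_table)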

Prints:

    area  cnt  party1  party2   sheet
0  bacon    9       5       5  Sheet1
1   spam    3       7       5  Sheet1
2   eggs    2      18       4  Sheet2
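If you already hold a pd.ExcelFile handle, as in the question, the same loop can be written against it. This is only a sketch: combine_book is a hypothetical helper name, and it relies on the sheet_names attribute and the parse method of pd.ExcelFile.

```python
import pandas as pd


def combine_book(book):
    """Stack every sheet of an open workbook into one DataFrame.

    `book` is expected to be a pd.ExcelFile, as in the question.
    """
    frames = []
    for name in book.sheet_names:
        sheet = book.parse(sheet_name=name)
        # keep only the text after the last newline in each column name
        sheet = sheet.rename(columns=lambda col: str(col).split("\n")[-1])
        sheet["sheet"] = name  # record the source sheet
        frames.append(sheet)
    # ignore_index=True renumbers the rows, so no separate reset_index is needed
    return pd.concat(frames, ignore_index=True)
```

With the question's handle this would be called as combine_book(pd.ExcelFile(path)).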
