pandas Combine Excel Spreadsheets

Views: 313 Posted: 2017/9/8 22:54:29 Tags: python excel

Problem description

I have an Excel workbook with many tabs. Each tab has the same set of headers as all others. I want to combine all of the data from each tab into one data frame (without repeating the headers for each tab).

So far, I've tried:

import pandas as pd
xl = pd.ExcelFile('file.xlsx')
df = xl.parse()

Can I use something for the parse argument that will mean "all spreadsheets"? Or is this the wrong approach?

Thanks in advance!

Update: I tried:

a=xl.sheet_names
b = pd.DataFrame()
for i in a:
    b.append(xl.parse(i))
b

But it's not "working".

Solution

This is one way to do it -- load all sheets into a dictionary of dataframes and then concatenate all the values in the dictionary into one dataframe.
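To make the shape of that dictionary concrete, here is a minimal sketch that uses small in-memory dataframes as a stand-in for the sheets of a workbook. The sheet names, column names and values are made up for illustration; only `pd.concat` and its `ignore_index` parameter are real pandas API.

```python
import pandas as pd

# Hypothetical stand-in for a dict of sheets: keys are sheet names,
# values are DataFrames that all share the same headers.
sheets = {
    "Sheet1": pd.DataFrame({"name": ["a", "b"], "value": [1, 2]}),
    "Sheet2": pd.DataFrame({"name": ["c"], "value": [3]}),
}

# Without ignore_index, each sheet keeps its own 0-based row labels,
# so the labels repeat across sheets.
overlapping = pd.concat(sheets.values())

# With ignore_index=True, the combined frame gets a fresh 0..n-1 index.
combined = pd.concat(sheets.values(), ignore_index=True)

print(overlapping.index.tolist())  # [0, 1, 0]
print(combined.index.tolist())     # [0, 1, 2]
```

The headers appear once, as column names of the combined frame, because each sheet's header row is consumed as column labels rather than as data.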

import pandas as pd

Set sheet_name to None in order to load all sheets into a dict of dataframes keyed by sheet name (older pandas versions spelled this parameter sheetname)

df = pd.read_excel('tmp.xlsx', sheet_name=None)

Then concatenate all dataframes, passing ignore_index=True to concat so that row labels from different sheets do not overlap (see comment by @bunji)

cdf = pd.concat(df.values(), ignore_index=True)

print(cdf)
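As for the loop in the question: `DataFrame.append` returned a new dataframe instead of modifying `b` in place, so each result was discarded, and the method was removed entirely in pandas 2.0. A sketch of the same loop-based idea that collects the parsed sheets in a list and concatenates once is below. `combine_sheets` is a hypothetical helper name, and it assumes every sheet has the same headers.

```python
import pandas as pd


def combine_sheets(xl):
    """Concatenate every sheet of an open workbook into one DataFrame.

    `xl` is expected to be a pd.ExcelFile (anything exposing `sheet_names`
    and `parse(name)` works). The helper name is hypothetical.
    """
    # Parse each sheet once and collect the results in a list.
    frames = [xl.parse(name) for name in xl.sheet_names]
    # One concat at the end; ignore_index gives a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)


# Usage, assuming 'file.xlsx' from the question exists:
# with pd.ExcelFile('file.xlsx') as xl:
#     b = combine_sheets(xl)
```

Building a list and calling `pd.concat` once also avoids copying the growing dataframe on every iteration.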
