Using pandas pd.read_excel() for multiple worksheets of the same workbook

Question

I have a large spreadsheet file (.xlsx) that I'm processing with Python pandas. I need data from two tabs (sheets) in that file. One of the tabs has a ton of data and the other is just a few cells.

When I use pd.read_excel() on any worksheet, it looks to me like the whole file is loaded (not just the worksheet I'm interested in). So when I call it twice (once for each sheet), I effectively have to wait for the whole workbook to be read in twice, even though each call only uses the specified sheet.

How do I load only specific sheet(s) with pd.read_excel()?

Recommended answer

Try pd.ExcelFile:

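import pandas as pd

# Open the workbook once, then read each sheet from the same ExcelFile object
xls = pd.ExcelFile('path_to_file.xls')
df1 = pd.read_excel(xls, sheet_name='Sheet1')
df2 = pd.read_excel(xls, sheet_name='Sheet2')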

As noted by @HaPsantran, the entire Excel file is read in during the ExcelFile() call (there doesn't appear to be a way around this). Using ExcelFile merely saves you from reading the same file again each time you want to access another sheet.
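
To make the pattern concrete, here is a minimal self-contained sketch. It assumes the openpyxl engine is installed, and the in-memory workbook, its sheet names, and its data are hypothetical stand-ins for the real file. It opens the workbook once and then pulls two sheets from the same ExcelFile object:

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (hypothetical data) so the
# example does not depend on a file on disk. Assumes openpyxl is installed.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2, 3]}).to_excel(writer, sheet_name="Sheet1", index=False)
    pd.DataFrame({"b": [10, 20]}).to_excel(writer, sheet_name="Sheet2", index=False)
workbook_bytes = buffer.getvalue()

# Open the workbook once; the context manager closes it when the block ends.
with pd.ExcelFile(io.BytesIO(workbook_bytes), engine="openpyxl") as xls:
    names = xls.sheet_names      # sheet names in workbook order
    df1 = xls.parse("Sheet1")    # same result as pd.read_excel(xls, "Sheet1")
    df2 = xls.parse("Sheet2")
```

With a real file, the path would replace the in-memory buffer in the pd.ExcelFile() call; the rest stays the same.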

Note that the sheet_name argument to pd.read_excel() can be the name of the sheet (as above), an integer specifying the sheet position (e.g. 0, 1, etc.), a list of sheet names or indices, or None. If a list is provided, it returns a dictionary whose keys are the sheet names/indices as given and whose values are the DataFrames. The default is to return only the first sheet (i.e., sheet_name=0).

If None is specified, all sheets are returned, as a {sheet_name: dataframe} dictionary.
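
The list and None forms of sheet_name can be illustrated with a short sketch. It assumes openpyxl is installed; the sheet names "big", "small", and "extra" and their contents are made up for the example:

```python
import io

import pandas as pd

# Hypothetical three-sheet workbook built in memory (assumes openpyxl).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"value": [1, 2, 3, 4]}).to_excel(writer, sheet_name="big", index=False)
    pd.DataFrame({"value": [5]}).to_excel(writer, sheet_name="small", index=False)
    pd.DataFrame({"value": [6, 7]}).to_excel(writer, sheet_name="extra", index=False)
workbook_bytes = buffer.getvalue()

# A list may mix names and zero-based positions; the returned dict
# is keyed by exactly the entries that were requested.
some = pd.read_excel(io.BytesIO(workbook_bytes), sheet_name=["big", 1])

# None returns every sheet, keyed by sheet name.
everything = pd.read_excel(io.BytesIO(workbook_bytes), sheet_name=None)
```

Either call reads the file a single time, so for the two-sheet case in the question, passing a list of the two sheet names is an alternative to creating an ExcelFile first.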
