How to concatenate multiple Excel sheets from the same file?
Question
I have a large Excel file that contains many different sheets. All the sheets have the same structure, for example:
Name
col1 col2 col3 col4
1 1 2 4
4 3 2 1
- How can I concatenate (vertically) all these sheets in Pandas without having to name each of them manually? If these were files, I could use glob to obtain a list of files in a directory. But here, for Excel sheets, I am lost.
- Is there a way to create a variable in the resulting dataframe that identifies the sheet name from which the data comes?
Thanks!
Recommended answer
Try this:
dfs = pd.read_excel(filename, sheet_name=None, skiprows=1)
This returns a dictionary of DataFrames keyed by sheet name, which you can easily concatenate using pd.concat(dfs), or, as @jezrael has already posted in his answer:
df = pd.concat(pd.read_excel(filename, sheet_name=None, skiprows=1))
sheet_name: None -> all sheets as a dictionary of DataFrames (the parameter was spelled sheetname before pandas 0.21)
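To illustrate what concatenating such a dictionary produces, here is a minimal sketch. It assumes a small hand-built dictionary (the sheet names d1 and d2 and their values are stand-ins for what pd.read_excel would return from a real workbook) so that it runs without an Excel file.

```python
import pandas as pd

# Hypothetical stand-in for the dict returned by pd.read_excel(...):
# keys are sheet names, values are DataFrames with identical columns.
dfs = {
    "d1": pd.DataFrame({"col1": [1, 4], "col2": [1, 3], "col3": [2, 2], "col4": [4, 1]}),
    "d2": pd.DataFrame({"col1": [3, 6], "col2": [3, 5], "col3": [4, 4], "col4": [6, 3]}),
}

# Passing a dict to pd.concat stacks the frames vertically and uses the
# dict keys (the sheet names) as the outer level of a MultiIndex.
combined = pd.concat(dfs)

print(combined)
print(combined.index.tolist())
# -> [('d1', 0), ('d1', 1), ('d2', 0), ('d2', 1)]
```

The sheet name is therefore already preserved in the index of the concatenated frame, even before any extra column is added.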
UPDATE:
Is there a way to create a variable in the resulting dataframe that identifies the sheet name from which the data comes?
dfs = pd.read_excel(filename, sheet_name=None, skiprows=1)
Let's say we have the following dict:
In [76]: dfs
Out[76]:
{'d1': col1 col2 col3 col4
0 1 1 2 4
1 4 3 2 1, 'd2': col1 col2 col3 col4
0 3 3 4 6
1 6 5 4 3}
Now we can add a new column:
In [77]: pd.concat([df.assign(name=n) for n,df in dfs.items()])
Out[77]:
col1 col2 col3 col4 name
0 1 1 2 4 d1
1 4 3 2 1 d1
0 3 3 4 6 d2
1 6 5 4 3 d2
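The result above keeps each sheet's original row labels, so the index repeats (0, 1, 0, 1). The following sketch shows two possible ways to get a clean running index along with the name column. It is only an illustration: the dictionary is a hand-built stand-in for the read_excel result, and the level name "row" is an arbitrary label chosen for this example.

```python
import pandas as pd

# Hypothetical stand-in for the dict of sheets returned by pd.read_excel(...).
dfs = {
    "d1": pd.DataFrame({"col1": [1, 4], "col2": [1, 3], "col3": [2, 2], "col4": [4, 1]}),
    "d2": pd.DataFrame({"col1": [3, 6], "col2": [3, 5], "col3": [4, 4], "col4": [6, 3]}),
}

# Variant 1: the same list comprehension as above, but ignore_index=True
# discards the per-sheet row labels and numbers the rows 0..n-1.
stacked = pd.concat(
    [df.assign(name=sheet) for sheet, df in dfs.items()],
    ignore_index=True,
)

# Variant 2: let concat turn the dict keys into an index level called
# "name", move that level into a regular column, then drop the rest.
keyed = (
    pd.concat(dfs, names=["name", "row"])
    .reset_index(level="name")
    .reset_index(drop=True)
)

print(stacked)
print(keyed)
```

Both variants carry the same data. The only visible difference is column order: the first appends name as the last column, while the second places it first.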