Error when using pandas read_excel(header=[0,1])


Question

I'm trying to use pandas read_excel to work with a file. The file has two header rows, so I'm trying to use the MultiIndex feature that is part of the header keyword argument.

import os
import pandas as pd

# Data is in the 2015 MOR folder
filename = 'MOR-JANUARY 2015.xlsx'

print(os.path.isfile(filename))

# Older pandas releases spelled the keyword `sheetname`
df1 = pd.read_excel(filename, header=[0, 1], sheet_name='MOR')

print(df1)

The error I get is ValueError: Length of new names must be 1, got 2. The file is in this Google Drive folder: https://drive.google.com/drive/folders/0B0ynKIVAlSgidFFySWJoeFByMDQ?usp=sharing. I'm trying to follow the solution posted in "Read excel sheet with multiple header using Pandas".

Recommended answer

I could be mistaken, but I don't think pandas handles parsing Excel rows that contain merged cells. In that first row, the merged cells get parsed as mostly empty cells. You'd need the labels repeated across each merged span for the header to be read correctly. This is what motivates the ffill below. If you can control the Excel workbook ahead of time, you might be able to use the code you have.
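To make the merged-cell issue concrete, here is a minimal sketch. It assumes, as this answer suggests, that a merged header cell comes through as one label followed by empty (NaN) cells when the sheet is read with header=None. The labels "Region", "Sales", and "Costs" are made up for illustration and are not taken from the asker's file.

```python
import numpy as np
import pandas as pd

# Hypothetical raw rows, shaped like read_excel(..., header=None) output
# when the top header row contains merged cells (assumed to arrive as NaN).
raw = pd.DataFrame([
    ["Region", "Sales", np.nan, "Costs", np.nan],
    ["DATE", "Q1", "Q2", "Q1", "Q2"],
])

# Forward-fill along each row so every column carries its group label.
filled = raw.ffill(axis=1)
print(filled.iloc[0].tolist())
```

After the fill, each column in the first row has a label of its own, which is what a two-level column index needs.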

My solution

It's not pretty, but it'll get it done.

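import pandas as pd

filename = 'MOR-JANUARY 2015.xlsx'
# Read every row as data so pandas does not try to parse the header itself.
# (Older pandas releases spelled the keyword `sheetname`.)
df1 = pd.read_excel(filename, sheet_name='MOR', header=None)

# Forward-fill along each row to repeat the merged header labels, then build
# the column MultiIndex from the first two rows, skipping the first column.
mux = pd.MultiIndex.from_arrays(df1.ffill(axis=1).values[:2, 1:], names=[None, 'DATE'])

# The remaining rows are the data; the first column becomes the index.
df1 = pd.DataFrame(df1.values[2:, 1:], index=df1.values[2:, 0], columns=mux)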
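The same steps can be tried without the workbook. The sketch below wraps them in a hypothetical helper, rebuild_two_row_header (not a pandas function), and runs it on a small in-memory frame standing in for the header=None read. The sample labels and numbers are invented for illustration.

```python
import numpy as np
import pandas as pd


def rebuild_two_row_header(raw, names=(None, "DATE")):
    """Hypothetical helper: rows 0-1 become column levels, column 0 the index."""
    # Repeat merged labels across each header row, then drop the first column.
    header = raw.iloc[:2].ffill(axis=1).iloc[:, 1:]
    columns = pd.MultiIndex.from_arrays(header.to_numpy(), names=list(names))
    body = raw.iloc[2:, 1:].to_numpy()
    index = raw.iloc[2:, 0].to_numpy()
    return pd.DataFrame(body, index=index, columns=columns)


# Stand-in for pd.read_excel(filename, sheet_name='MOR', header=None).
raw = pd.DataFrame([
    [np.nan, "Sales", np.nan, "Costs", np.nan],
    ["DATE", "Q1", "Q2", "Q1", "Q2"],
    ["2015-01-01", 10, 11, 4, 5],
    ["2015-01-02", 12, 13, 6, 7],
])

tidy = rebuild_two_row_header(raw)
print(tidy)
```

Because the values pass through a mixed-type array, the resulting columns have object dtype; calling tidy.infer_objects() afterwards is one way to recover numeric dtypes.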