Pandas read_excel sometimes creates index even when index_col=None
Question
I'm trying to read an Excel file into a data frame and I want to set the index later, so I don't want pandas to use column 0 for the index values.
By default (index_col=None), it shouldn't use column 0 for the index, but I find that if there is no value in cell A1 of the worksheet, it does.
Is there any way to override this behaviour? (I am loading many sheets that have no value in cell A1.)
This works as expected when test1.xlsx has the value "DATE" in cell A1:
In [19]: pd.read_excel('test1.xlsx')
Out[19]:
                 DATE         A         B         C
0 2018-01-01 00:00:00  0.766895  1.142639  0.810603
1 2018-01-01 01:00:00  0.605812  0.890286  0.810603
2 2018-01-01 02:00:00  0.623123  1.053022  0.810603
3 2018-01-01 03:00:00  0.740577  1.505082  0.810603
4 2018-01-01 04:00:00  0.335573 -0.024649  0.810603
But when the worksheet has no value in cell A1, it automatically assigns the column 0 values to the index:
In [20]: pd.read_excel('test2.xlsx', index_col=None)
Out[20]:
                            A         B         C
2018-01-01 00:00:00  0.766895  1.142639  0.810603
2018-01-01 01:00:00  0.605812  0.890286  0.810603
2018-01-01 02:00:00  0.623123  1.053022  0.810603
2018-01-01 03:00:00  0.740577  1.505082  0.810603
2018-01-01 04:00:00  0.335573 -0.024649  0.810603
This is not what I want.
Desired result: the same as the first example (but perhaps with 'Unnamed' as the column label).
The documentation says:
index_col : int, list of int, default None.
Column (0-indexed) to use as the row labels of the DataFrame. Pass None if there is no such column.
Recommended answer
The issue you're describing matches a known pandas bug. This bug was fixed in the pandas 0.24.0 release:
Bug fixes
- Bug in read_excel() in which index_col=None was not being respected and parsing index columns anyway (GH18792, GH20480)
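If upgrading to 0.24.0 or later is not an option, one possible workaround is to move the inferred index back into a regular column after loading. The sketch below is only an illustration: restore_first_column is a hypothetical helper (not part of pandas), the default label "Unnamed: 0" is an assumption taken from the desired result in the question, and the small frame stands in for what an affected version returns for test2.xlsx, so no Excel file is needed to try it.

```python
import pandas as pd


def restore_first_column(df, name="Unnamed: 0"):
    """Hypothetical helper: turn an inferred index back into column 0."""
    if isinstance(df.index, pd.RangeIndex):
        # The index is already the default 0..n-1, so there is nothing to undo.
        return df
    # Label the index (assumed label), then move it into the columns;
    # reset_index() also restores the default integer index.
    return df.rename_axis(name).reset_index()


# Stand-in for the frame an affected pandas version builds from test2.xlsx.
affected = pd.DataFrame(
    {"A": [0.766895, 0.605812], "B": [1.142639, 0.890286]},
    index=pd.to_datetime(["2018-01-01 00:00:00", "2018-01-01 01:00:00"]),
)

fixed = restore_first_column(affected)
print(fixed.columns.tolist())  # ['Unnamed: 0', 'A', 'B']
```

Because the helper returns frames with a default integer index unchanged, it should be safe to apply to every sheet, including those that do have a value in cell A1. The index can then be set later with set_index() as originally intended.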