Reading an Excel file in Python using pandas
Problem description
I am trying to read an Excel file this way:
newFile = pd.ExcelFile(PATHFileName.xlsx)
ParsedData = pd.io.parsers.ExcelFile.parse(newFile)
which throws an error saying that two arguments are expected. I don't know what the second argument is. What I am trying to achieve is to convert an Excel file to a DataFrame. Am I doing it the right way, or is there another way to do this using pandas?
Close: first you call ExcelFile, but then you call the .parse method and pass it the sheet name.
>>> xl = pd.ExcelFile("dummydata.xlsx")
>>> xl.sheet_names
[u'Sheet1', u'Sheet2', u'Sheet3']
>>> df = xl.parse("Sheet1")
>>> df.head()
                  Tid  dummy1    dummy2    dummy3    dummy4    dummy5
0 2006-09-01 00:00:00       0  5.894611  0.605211  3.842871  8.265307
1 2006-09-01 01:00:00       0  5.712107  0.605211  3.416617  8.301360
2 2006-09-01 02:00:00       0  5.105300  0.605211  3.090865  8.335395
3 2006-09-01 03:00:00       0  4.098209  0.605211  3.198452  8.170187
4 2006-09-01 04:00:00       0  3.338196  0.605211  2.970015  7.765058

     dummy6  dummy7    dummy8    dummy9
0  0.623354       0  2.579108  2.681728
1  0.554211       0  7.210000  3.028614
2  0.567841       0  6.940000  3.644147
3  0.581470       0  6.630000  4.016155
4  0.595100       0  6.350000  3.974442
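For readers who want to try the instance-based pattern without the original dummydata.xlsx, here is a minimal, self-contained sketch. It assumes a reasonably recent pandas with an Excel engine such as openpyxl installed. The workbook contents and the temporary path are made up for illustration and are not from the original post. It also shows pd.read_excel, which is probably the shortest route when only a single sheet is needed.

```python
import os
import tempfile

import pandas as pd

# Build a small throwaway workbook so the example runs on its own.
# The data here is hypothetical, not the poster's file.
path = os.path.join(tempfile.mkdtemp(), "dummydata.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"Tid": [1, 2], "dummy1": [0.5, 1.5]}).to_excel(
        writer, sheet_name="Sheet1", index=False
    )
    pd.DataFrame({"other": [10]}).to_excel(
        writer, sheet_name="Sheet2", index=False
    )

# Option 1: open the workbook once, then parse a sheet from the instance.
with pd.ExcelFile(path) as xl:
    sheet_names = xl.sheet_names
    df_from_parse = xl.parse("Sheet1")

# Option 2: read one sheet straight into a DataFrame.
df_from_read = pd.read_excel(path, sheet_name="Sheet1")
```

Opening an ExcelFile first tends to be handy when several sheets are read from the same workbook, since the file is only opened once.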
What you're doing is calling the method which lives on the class itself, rather than the instance, which is okay (although not very idiomatic), but if you're doing that you would also need to pass the sheet name:
>>> parsed = pd.io.parsers.ExcelFile.parse(xl, "Sheet1")
>>> parsed.columns
Index([u'Tid', u'dummy1', u'dummy2', u'dummy3', u'dummy4', u'dummy5', u'dummy6', u'dummy7', u'dummy8', u'dummy9'], dtype=object)
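The "two arguments expected" error is ordinary Python behaviour and is not specific to pandas. The sketch below uses a toy Workbook class (hypothetical, a stand-in for ExcelFile and not part of pandas) to show that calling a method through the class means supplying the instance yourself, and that leaving out the sheet name raises a TypeError either way.

```python
class Workbook:
    """Hypothetical stand-in for a workbook object; not part of pandas."""

    def __init__(self, sheets):
        self.sheets = sheets  # maps sheet name -> list of rows

    def parse(self, sheet_name):
        return self.sheets[sheet_name]


wb = Workbook({"Sheet1": [1, 2, 3]})

# Idiomatic: call the method on the instance; `self` is bound automatically.
via_instance = wb.parse("Sheet1")

# Equivalent but unusual: call through the class and pass the instance as `self`.
via_class = Workbook.parse(wb, "Sheet1")

# Mirrors the question: only the instance is passed, so the sheet name is missing.
try:
    Workbook.parse(wb)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Note that in current pandas releases the class is normally reached as pd.ExcelFile. The pd.io.parsers.ExcelFile path used above reflects the older version the thread was written against.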