Error (little-endian) reading an XLS file with Python

Problem description

I downloaded an XLS file from the web using Selenium.

I tried many options I found on Stack Overflow and other websites to read the XLS file:

import pandas as pd
df = pd.read_excel('test.xls') # Read XLS file
Expected "little-endian" marker, found b'\xff\xfe'

And also:

df = pd.ExcelFile('test.xls').parse('Sheet1') # Read XLS file
Expected "little-endian" marker, found b'\xff\xfe'

And again:

from xlrd import open_workbook
book = open_workbook('test.xls') 
CompDocError: Expected "little-endian" marker, found b'\xff\xfe'

I have tried different encodings: utf-8, ANSI, utf_16_be, utf16. I have even tried to get the encoding of the file from Notepad and other applications.
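The b'\xff\xfe' in the message may not be a text encoding at all. An Excel 97-2003 file is an OLE2 compound file, and its header stores a byte-order field at offset 28 that is normally the two bytes b'\xfe\xff'. The sketch below only inspects those header bytes; the function name and constants are hypothetical, and the link to xlrd's check is an assumption based on the error text rather than something the question confirms.

```python
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # magic bytes of an OLE2 compound file
EXPECTED_BYTE_ORDER = b"\xfe\xff"  # header bytes 28-29 in a well-formed file


def inspect_xls_header(path):
    """Return (has_ole2_signature, byte_order_bytes) read from the file header."""
    with open(path, "rb") as handle:
        header = handle.read(30)  # signature is bytes 0-7, byte order is bytes 28-29
    return header[:8] == OLE2_SIGNATURE, header[28:30]
```

Calling inspect_xls_header('test.xls') on the downloaded file and comparing the second value with EXPECTED_BYTE_ORDER would show whether the header, rather than the text encoding, is what the readers are rejecting.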

File type: Microsoft Excel 97-2003 Worksheet (.xls). I can open the file with Excel without any issue. What's frustrating is that if I open the file with Excel and just press Save, I can then read the file with any of the previous Python commands.

I would be really grateful if someone could suggest other ideas to try. I need to open this file with a Python script only.

Thanks, Max

Solution (somewhat messy but simple) that could potentially work for any type of Excel file:

I call Excel from Python (through its COM object model, via win32com) to open the file and save it again. Excel "cleans up" the file, and Python can then read it with any of the usual Excel-reading functions.

Solution inspired by the comments from @Serge Ballesta and @John Y.

# Open the file in Excel and save it again to correct the error
import win32com.client
import pandas

downloadpath = "c:\\firefox_downloads\\"
filename = "myfile.xls"

xl = win32com.client.Dispatch("Excel.Application")
xl.Application.DisplayAlerts = False  # suppress Excel's pop-up when overwriting the file
wb = xl.Workbooks.Open(Filename=downloadpath + filename)
wb.SaveAs(downloadpath + filename)
wb.Close()  # Close is a method, so it needs parentheses
xl.Application.DisplayAlerts = True  # re-enable Excel's pop-up messages

df = pandas.ExcelFile(downloadpath + filename).parse('Sheet1')  # read the re-saved XLS file
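
The same open-and-resave step can be wrapped in a function that always closes the workbook and quits Excel, even if saving fails. This is a minimal sketch, not the author's exact code: the name resave_with_excel and the optional excel parameter are hypothetical, and the default path through win32com assumes Windows with Excel and pywin32 installed.

```python
import os


def resave_with_excel(path, excel=None):
    """Open `path` in Excel and save it in place so Excel rewrites the file.

    `excel` is an optional, already-created Excel.Application COM object.
    When omitted, one is created with pywin32 and quit again afterwards.
    """
    path = os.path.abspath(path)  # Excel does not share Python's working directory
    created_here = excel is None
    if created_here:
        import win32com.client  # imported lazily: only available on Windows
        excel = win32com.client.Dispatch("Excel.Application")

    excel.DisplayAlerts = False  # suppress the overwrite confirmation
    try:
        workbook = excel.Workbooks.Open(path)
        try:
            workbook.SaveAs(path)
        finally:
            workbook.Close()  # always release the file
    finally:
        excel.DisplayAlerts = True
        if created_here:
            excel.Quit()  # do not leave a hidden Excel process behind
    return path
```

After resave_with_excel("c:\\firefox_downloads\\myfile.xls") returns, the file can be passed to pandas as in the script above.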

Thanks all!

Recommended answer

What is pd?

pandas is made for data science. In my opinion, you have to use openpyxl (reads and writes xlsx only) or xlrd/xlwt (xlrd reads xls, xlwt writes xls only).

from xlrd import open_workbook
book = open_workbook('test.xls')  # path to the XLS file
sheet = ...  # e.g. book.sheet_by_index(0) for the first sheet

There are several examples of this on the Internet.
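
The library split described in this answer can be written down as a small helper that picks a reader by file extension. This is only an illustration: pick_excel_engine and the mapping are hypothetical names, and choosing a reader this way would not by itself fix the little-endian error from the question.

```python
import os

# Assumed mapping from file extension to a reader library,
# following the split described in the answer.
ENGINE_BY_EXTENSION = {
    ".xls": "xlrd",
    ".xlsx": "openpyxl",
}


def pick_excel_engine(path):
    """Return the reader library name suggested for this file's extension."""
    extension = os.path.splitext(path)[1].lower()
    try:
        return ENGINE_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(f"no engine known for extension {extension!r}") from None
```

The returned string matches the names accepted by the engine parameter of pandas.read_excel, so it could be passed as pd.read_excel(path, engine=pick_excel_engine(path)).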
