Unable to read xlsb file using pandas


Problem description

I am trying to read a local xlsb file using pandas' read_excel, but I am getting an error. My code:

import pandas as pd
df3 = pd.read_excel('a.xlsb', engine = 'pyxlsb')


Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-06db88cb2446> in <module>
----> 1 pd.read_excel('a.xlsb', engine='pyxlsb')

/usr/local/lib/python3.5/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    186                 else:
    187                     kwargs[new_arg_name] = new_arg_value
--> 188             return func(*args, **kwargs)
    189         return wrapper
    190     return _deprecate_kwarg

/usr/local/lib/python3.5/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    186                 else:
    187                     kwargs[new_arg_name] = new_arg_value
--> 188             return func(*args, **kwargs)
    189         return wrapper
    190     return _deprecate_kwarg

/usr/local/lib/python3.5/dist-packages/pandas/io/excel.py in read_excel(io, sheet_name, header, names, index_col, parse_cols, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, verbose, parse_dates, date_parser, thousands, comment, skip_footer, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    348 
    349     if not isinstance(io, ExcelFile):
--> 350         io = ExcelFile(io, engine=engine)
    351 
    352     return io.parse(

/usr/local/lib/python3.5/dist-packages/pandas/io/excel.py in __init__(self, io, engine)
    644             engine = 'xlrd'
    645         if engine not in self._engines:
--> 646             raise ValueError("Unknown engine: {engine}".format(engine=engine))
    647 
    648         # could be a str, ExcelFile, Book, etc.

ValueError: Unknown engine: pyxlsb

It works fine for csv and xlsx files.

Python version: 3.5.2
pandas version: 0.24.2

Recommended answer

After looking into the problem further and following @Datanovice's comment, I found that it works once I update to pandas v1.0, the release that added the pyxlsb engine to read_excel. I am using Ubuntu 16.04, which only updates Python automatically as far as 3.5, while pandas v1.0 requires Python 3.6 or later. Hence, even after updating everything to the latest available versions, I was not able to run the code. The fix is to install Python 3.6 and then install pandas v1.0 for it.

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.6

With Python 3.6 and pandas v1.0 (and the pyxlsb package installed), we can simply pass engine='pyxlsb' to read_excel to read the file.

import pandas as pd
df3 = pd.read_excel('a.xlsb', engine = 'pyxlsb')
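
As a rough sketch of the version requirement described above, the helper below checks the installed versions before calling read_excel, so that an old setup fails with a readable message rather than "Unknown engine: pyxlsb". The names parse_version, can_use_pyxlsb_engine and read_xlsb are hypothetical and not part of pandas; the thresholds (pandas 1.0, Python 3.6) are the ones stated in this answer.

```python
import re
import sys


def parse_version(version):
    """Return the leading (major, minor) integers of a string like '0.24.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognised version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def can_use_pyxlsb_engine(pandas_version, python_version=sys.version_info[:2]):
    """True when pandas is >= 1.0 and Python is >= 3.6 (assumed thresholds)."""
    return parse_version(pandas_version) >= (1, 0) and tuple(python_version) >= (3, 6)


def read_xlsb(path):
    """Read an xlsb file, failing early with a clear message on old setups."""
    import pandas as pd  # imported lazily so the checks above work without pandas

    if not can_use_pyxlsb_engine(pd.__version__):
        raise RuntimeError(
            "pandas {} on Python {}.{} cannot use engine='pyxlsb'; "
            "pandas >= 1.0 on Python >= 3.6 is needed".format(
                pd.__version__, *sys.version_info[:2]
            )
        )
    # The pyxlsb package must also be installed for this engine to work.
    return pd.read_excel(path, engine="pyxlsb")
```

With the setup from the question (pandas 0.24.2 on Python 3.5), can_use_pyxlsb_engine('0.24.2', (3, 5)) returns False, so read_xlsb('a.xlsb') would stop at the RuntimeError before read_excel is called.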

Reference for installing Python 3.6 on Ubuntu 16.04: https://askubuntu.com/questions/865554/how-do-i-install-python-3-6-using-apt-get
