无法将 pandas 数据框导出到Excel/编码 [英] Can't export pandas dataframe to excel / encoding

查看:103
本文介绍了无法将 pandas 数据框导出到Excel/编码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

由于编码困难,我无法导出其中一个数据框.

I'm unable to export one of my dataframes due to some encoding difficulty.

sjM.dtypes

Customer Name              object
Total Sales               float64
Sales Rank                float64
Visit_Frequency           float64
Last_Sale          datetime64[ns]
dtype: object

csv导出效果很好

path = 'c:\\test'
sjM.to_csv(path + '.csv')   # Works

但是excel导出失败

but the excel export fails

sjM.to_excel(path + '.xls')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "testing.py", line 338, in <module>
    sjM.to_excel(path + '.xls')
  File "c:\Anaconda\Lib\site-packages\pandas\core\frame.py", line 1197, in to_excel
    excel_writer.save()
  File "c:\Anaconda\Lib\site-packages\pandas\io\excel.py", line 595, in save
    return self.book.save(self.path)
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "c:\Anaconda\Lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 22: ordinal not in range(128)

我知道问题出在客户名称"列,因为删除后导出到excel效果很好.

I know that the problem is coming from the 'Customer Name' column, as after deletion the export to excel works fine.

我已尝试按照该问题的建议进行操作( Python pandas to_excel'utf8'编解码器无法解码字节),使用函数对有问题的列进行解码和重新编码

I've tried following advice from that question (Python pandas to_excel 'utf8' codec can't decode byte), using a function to decode and re-encode the offending column

def changeencode(data):
    cols = data.columns
    for col in cols:
        if data[col].dtype == 'O':
            data[col] = data[col].str.decode('latin-1').str.encode('utf-8')
    return data

sJM = changeencode(sjM)

sjM['Customer Name'].str.decode('utf-8')

L2-00864                         SETIA 2
K1-00279                     BERKAT JAYA
L2-00664                        TK. ANTO
BR00035                   BRASIL JAYA,TK
RA00011               CV. RAHAYU SENTOSA

所以转换为Unicode似乎成功

so the conversion to unicode appears to be successful

sjM.to_excel(path + '.xls')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Anaconda\Lib\site-packages\pandas\core\frame.py", line 1197, in to_excel
    excel_writer.save()
  File "c:\Anaconda\Lib\site-packages\pandas\io\excel.py", line 595, in save
    return self.book.save(self.path)
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "c:\Anaconda\Lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 22: ordinal not in range(128)

  1. 即使转换为Unicode似乎成功,为什么也会失败?
  2. 如何解决此问题以将该数据框导出为ex​​cel?

@杰夫

感谢您向我展示正确的方向

Thanks for showing me the right direction

使用的步骤:

安装xlsxwriter(不与熊猫捆绑)

install xlsxwriter (not bundled with pandas)

sjM.to_excel(path + '.xlsx', sheet_name='Sheet1', engine='xlsxwriter')

推荐答案

您需要使用pandas> = 0.13,而xlsxwriter引擎用于excel,该引擎支持本机unicode编写. xlwt,默认引擎将支持传递编码选项,该选项将在0.14中可用.

You need to use pandas >= 0.13, and the xlsxwriter engine for excel, which supports native unicode writing. xlwt, the default engine will support passing an encoding option will be available in 0.14.

有关引擎文档,请参见此处

see here for the engine docs.

这篇关于无法将 pandas 数据框导出到Excel/编码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆