Pandas: save to excel encoding issue

Views: 203 Posted: 2017/8/17 0:46:10 python excel pandas encoding utf-8

Problem description

I have a similar problem to the one mentioned here but none of the suggested methods work for me.

I have a medium size utf-8 .csv file with a lot of non-ascii characters. I am splitting the file by a particular value from one of the columns, and then I'd like to save each of the obtained dataframes as an .xlsx file with the characters preserved.

This doesn't work, as I am getting an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 7: ordinal not in range(128)
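
A message of this shape is raised whenever a byte string containing a non-ASCII byte is decoded with the ascii codec. The minimal sketch below reproduces the same wording with hypothetical bytes (they are not taken from the file in question), and checks two facts that may help narrow the cause down: the byte 0xff never occurs in valid UTF-8, while it is the first byte of the UTF-16 little-endian byte order mark. Whether that is what happens in the asker's setup is an assumption, not something the question confirms.

```python
import codecs


def ascii_decode_error(raw: bytes) -> str:
    """Return the message raised when raw is decoded as ASCII, or '' if it decodes cleanly."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return ""


# Hypothetical input: seven ASCII bytes followed by the byte 0xff at index 7.
sample = b"abcdefg\xff"
message = ascii_decode_error(sample)
print(message)

# Polish text encoded as UTF-8 contains non-ASCII bytes, but never 0xff.
utf8_bytes = "Zózia kryś, Wrocław".encode("utf-8")
print(0xFF in utf8_bytes)

# The UTF-16 little-endian byte order mark starts with 0xff.
print(codecs.BOM_UTF16_LE[0] == 0xFF)
```

If the failing byte really is 0xff, it may be worth checking whether the data being decoded is UTF-16 rather than UTF-8 at the point where the error is raised.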

Here is what I tried:

1. Using the xlsxwriter engine explicitly. This doesn't seem to change anything.

2. Defining a function (below) to change the encoding and throw away bad characters. This also doesn't change anything.

def changeencode(data):
    cols = data.columns
    for col in cols:
        if data[col].dtype == 'O':
            data[col] = data[col].str.decode('utf-8').str.encode('ascii', 'ignore')
    return data

3. Changing by hand all the offending characters to some others. Still no effect (the quoted error was obtained after this change).

4. Encoding the file as utf-16 (which, I believe, is the correct encoding since I want to be able to manipulate the file from within Excel afterwards) doesn't help either.

I believe that the problem is in the file itself (because of 2 and 3) but I have no idea how to get around it. I'd appreciate any help. The beginning of the file is pasted below.

"Submitted","your-name","youremail","phone","miasto","cityCF","innemiasto","languagesCF","morelanguages","wiek","partnerCF","messageCF","acceptance-795","Submitted Login","Submitted From","2015-12-25 14:07:58 +00:00","Zózia kryś","test@tes.pl","4444444","Wrocław","","testujemy polskie znaki","Polski","testujemy polskie znaki","44","test","test","1","Justyna","99.111.155.132",

Edit

Some code (one of the versions, without the splitting part):

import pandas as pd
import string
import xlsxwriter

df = pd.read_csv('path-to-file.csv')

with pd.ExcelWriter('test.xlsx') as writer:
    df.to_excel(writer, sheet_name='sheet1', engine='xlsxwriter')
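
For reference, the workflow described in the question (read a UTF-8 CSV, split it by the values of one column, write each part to its own .xlsx file) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a confirmed fix: the helper names `split_by_column` and `save_groups` are hypothetical, the in-memory data is a stand-in with one invented row, the split column `miasto` is chosen arbitrarily from the header above, and the encoding is passed to `read_csv` explicitly on the assumption that the file really is UTF-8.

```python
import io

import pandas as pd


def split_by_column(df, column):
    """Return a dict mapping each distinct value in `column` to its sub-DataFrame."""
    return {value: group for value, group in df.groupby(column)}


def save_groups(groups, prefix):
    """Write each group to '<prefix>_<value>.xlsx' (requires xlsxwriter to be installed).

    Assumes the group values are safe to use in file names.
    """
    for value, group in groups.items():
        with pd.ExcelWriter(f"{prefix}_{value}.xlsx", engine="xlsxwriter") as writer:
            group.to_excel(writer, sheet_name="sheet1", index=False)


# Stand-in for the real file: UTF-8 bytes with Polish characters.
# The second data row is invented for illustration.
raw = (
    '"your-name","miasto"\n'
    '"Zózia kryś","Wrocław"\n'
    '"Jan","Kraków"\n'
).encode("utf-8")

# State the encoding explicitly instead of relying on a default.
df = pd.read_csv(io.BytesIO(raw), encoding="utf-8")
groups = split_by_column(df, "miasto")
print(sorted(groups))
```

Calling `save_groups(groups, "test")` would then write one workbook per distinct value. The engine is given to `pd.ExcelWriter` here, since that is where the writer object is created.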
