Encoding/decoding non-ASCII characters when using Python pandas

Views: 180 Published: 2020/9/7 20:38:41 python-2.7 pandas character-encoding ascii non-ascii-characters

Question

I have some data with non-ASCII characters. I attempted to handle it with the following:

# coding=utf-8
import pandas as pd
from pandas import DataFrame, Series
import sys
import re
reload(sys)
sys.setdefaultencoding('latin1')

However, some records still give me encoding/decoding problems. I have copied and pasted one of the problematic records (its name and location columns) below:

'EugÃ¨ne Badeau'    'E, QuÃ©bec (county/comtÃ©), Quebec, Canada'

Appending .decode('utf-8') to this exact text resolved the problem:

print 'EugÃ¨ne Badeau   E, QuÃ©bec (county/comtÃ©), Quebec, Canada'.decode('utf-8')
output: Eugène Badeau   E, Québec (county/comté), Quebec, Canada
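
For readers on Python 3, where str has no .decode method, here is a minimal sketch of the same repair. It assumes the garbling came from UTF-8 bytes being read as Latin-1, which fits the sample record but is not confirmed for the whole data set. fix_mojibake is a hypothetical helper name, not something from the question.

```python
def fix_mojibake(text):
    """Undo UTF-8 bytes that were mistakenly decoded as Latin-1 (assumed cause)."""
    try:
        # Turn the characters back into the original bytes, then decode them properly.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not match the assumed pattern; leave it unchanged.
        return text


garbled = "EugÃ¨ne Badeau   E, QuÃ©bec (county/comtÃ©), Quebec, Canada"
print(fix_mojibake(garbled))
# Eugène Badeau   E, Québec (county/comté), Quebec, Canada
```

With pandas, a helper like this could be applied to a text column with df['name'].apply(fix_mojibake). Text that is already correct passes through untouched, because the round trip raises an error and the original value is returned.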

So I tried to use it to convert my pandas column:

df.name = df.name.str.encode('utf-8')

The location seems to be OK, but the name is still wrong:

print df.location[735]
print df.name[735]

output:
E, Québec (county/comté), Quebec, Canada
eugã¨ne badeau
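
One possible explanation for the name column, offered as a guess and not something the question confirms: the lower-case ã in the output suggests the name was lower-cased while it was still garbled. Lower-casing turns Ã (byte 0xC3) into ã (byte 0xE3), and the resulting bytes are no longer valid UTF-8, so the Latin-1/UTF-8 round trip cannot recover the text. A small Python 3 sketch of that effect, with try_repair as a hypothetical helper:

```python
def try_repair(text):
    """Return the repaired text, or None if the bytes are not valid UTF-8."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except UnicodeDecodeError:
        return None


garbled = "Eug\u00c3\u00a8ne Badeau"  # the garbled name from the question
lowered = garbled.lower()              # 'Ã' becomes 'ã', matching the printed output

print(try_repair(garbled))          # repairable: the original bytes are intact
print(try_repair(lowered))          # None: lower-casing changed a UTF-8 lead byte
print(try_repair(garbled).lower())  # repairing first, then lower-casing, works
```

If this is what happened, the repair would need to run on the raw column before any lower-casing or other string normalization.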

Recommended answer

You could use apply combined with the unidecode library:

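from unidecode import unidecode

# Python 2: unicode(x, encoding="utf-8") decodes the byte string,
# then unidecode transliterates it to plain ASCII (accents are dropped).
df['name'] = df['name'].apply(lambda x: unidecode(unicode(x, encoding="utf-8")))
df['location'] = df['location'].apply(lambda x: unidecode(unicode(x, encoding="utf-8")))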

;)
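
The unicode() built-in used above exists only in Python 2. As a rough Python 3 sketch of the same idea, the snippet below decodes bytes as UTF-8 and then strips accents with the standard library's unicodedata module. This is a stand-in for unidecode, not its implementation: unidecode transliterates far more characters, while this version only removes combining accents. strip_accents is a hypothetical helper name.

```python
import unicodedata


def strip_accents(value):
    """Return value with accents removed (hypothetical helper, stdlib only)."""
    if isinstance(value, bytes):
        # Assumes the bytes are UTF-8, as in the answer above.
        value = value.decode("utf-8")
    # NFKD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", value)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Eugène Badeau".encode("utf-8")))
# Eugene Badeau
print(strip_accents("E, Québec (county/comté), Quebec, Canada"))
# E, Quebec (county/comte), Quebec, Canada
```

On a DataFrame this could be used as df['name'] = df['name'].apply(strip_accents). Like the unidecode approach, it discards the accents instead of restoring them, so it suits matching or searching more than display.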
