Converting the unicode data of a DataFrame to strings


Question

I am having trouble with a DataFrame obtained by reading an xls file. Every value in the DataFrame has the type 'unicode', and I can't do anything with it. I want to change these values to str. Also, if possible, I'd like to know why this happens. I heard something about 'external data', and I know that the column and index names are also displayed with the unicode 'u' prefix. I know almost nothing about encoding, and I would be really grateful if someone could also explain a bit about it.

I'm using Python 2, and I tried to solve it column by column with functions such as

.astype(str) 
.astype(basestring)
.apply(str) 

.str.decode('iso-8859-1').str.encode('utf-8') 

(I read about this last one here and just added it to my code to try something else.) I also tried

unicodedata.normalize('NFKD', df_bolsa[l]).encode('ascii','ignore')

but this last one cannot be used with a Series. I hope someone can help me clarify this matter. Thank you very much in advance!

Recommended answer

You can use the following code.

# Python 2: encode each unicode column to UTF-8 byte strings (str).
for column in df:
    df[column] = df[column].str.encode('utf-8')
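On the `unicodedata.normalize` attempt from the question: that function takes a single text string, not a whole Series, which is likely why the call failed. Below is a minimal sketch of applying it one value at a time. The helper name `to_ascii` and the sample values are hypothetical, chosen only for illustration, and a plain list stands in for a DataFrame column so the snippet runs without pandas.

```python
import unicodedata


def to_ascii(value):
    """Return an ASCII byte string for one text value (hypothetical helper).

    Accented characters are decomposed (NFKD), and the combining marks,
    which have no ASCII equivalent, are dropped by errors='ignore'.
    Non-text values (numbers, None) are returned unchanged.
    """
    # type(u'') is `unicode` on Python 2 and `str` on Python 3.
    if not isinstance(value, type(u'')):
        return value
    return unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')


# Plain-list stand-in for one DataFrame column (made-up sample values).
column = [u'caf\xe9', u'Per\xfa', 42]
converted = [to_ascii(v) for v in column]
print(converted)

# With pandas, the same helper could be applied per column, for example:
#     df[col] = df[col].map(to_ascii)
```

Note that this approach is lossy: accents are stripped rather than preserved, whereas encoding to UTF-8 keeps every character. On Python 2 the result is a `str`; on Python 3 the same code returns `bytes`.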
