Difference between str() and astype(str)?

Question

I want to save the dataframe df to the .h5 file MainDataFile.h5:

df.to_hdf("c:/Temp/MainDataFile.h5", "MainData", mode="w", format="table", data_columns=['_FirstDayOfPeriod', 'Category', 'ChannelId'])

and I get the following error:

*** Exception: cannot find the correct atom type -> > [dtype->object,items->Index(['Libellé_Article', 'Libellé_segment'], dtype='object')]

If I modify the column 'Libellé_Article' in this way:

df['Libellé_Article'] = str(df['Libellé_Article'])

the error goes away, whereas I still get the error message when doing:

df['Libellé_Article'] = df['Libellé_Article'].astype(str)

The problem is that using str() is blowing up my RAM.

Any ideas?

Recommended answer

str(df['Libellé_Article']) converts the contents of the entire column into a single string, so you end up with one very big string. That is the reason your RAM is blowing up.

For example:

>>> import pandas as pd
>>> df = pd.DataFrame([1, 2, 3], columns=['A'])
>>> df['A']
0    1
1    2
2    3
Name: A, dtype: int64

>>> str(df['A'])
'0    1\n1    2\n2    3\nName: A, dtype: int64'
>>> df['A'].astype(str)
0    1
1    2
2    3
Name: A, dtype: object

So if you want to convert the entire column to string type, use .astype(str) only.
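The difference can also be checked programmatically. The following is a minimal sketch on a toy three-row frame (the names as_text, broadcast and converted are illustrative and do not come from the question); it shows that str() yields one Python string which, once assigned to a column, is repeated on every row, while .astype(str) converts each value separately.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# str() on a Series returns ONE Python string: its printed representation.
as_text = str(df["A"])
print(type(as_text).__name__)

# Assigning that single string to a column broadcasts it to every row,
# so the original per-row values are lost.
broadcast = df.copy()
broadcast["A"] = as_text
print(broadcast["A"].nunique())

# astype(str) converts element by element and keeps one value per row.
converted = df["A"].astype(str)
print(converted.tolist())
```

On this toy data the broadcast column holds a single distinct value, whereas the converted column still holds '1', '2' and '3'.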
