Pandas: Cast column to string does not work


Problem description


I have a dataframe resultstatsDF:

import pandas as pd

resultstatsDF = pd.DataFrame({'a': [1,2,3,4,5]})
resultstatsDF['file'] = 'asdf'
resultstatsDF.dtypes
a        int64
file    object
dtype: object

The column file has dtype object, and I would like to cast it to string.

I tried:

resultstatsDF = resultstatsDF.astype({'file': str})
resultstatsDF['file'] = resultstatsDF['file'].astype(str)
resultstatsDF['file'] = resultstatsDF['file'].to_string
resultstatsDF['file'] = resultstatsDF.file.apply(str)
resultstatsDF['file'] = resultstatsDF['file'].apply(str)

but whatever I do, when I check with

resultstatsDF.dtypes

the column file remains of type object.

Solution

By default, a column holding Python strings, dicts, or lists has dtype object, so dtypes cannot tell these apart. To test the actual type, select a value from the column, e.g. with iat:

type(resultstatsDF['file'].iat[0])

Sample:

resultstatsDF = pd.DataFrame({'file':['a','d','f']})
print (resultstatsDF)
  file
0    a
1    d
2    f

print (type(resultstatsDF['file'].iloc[0]))
<class 'str'>

print (resultstatsDF['file'].apply(type))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: file, dtype: object
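
If the goal is a column whose dtype itself reads as a string type, one option is the dedicated string extension dtype. This is a minimal sketch, assuming pandas 1.0 or later, where astype("string") is available; the helper variables uses_string_dtype and first_value_type are only there for illustration:

```python
import pandas as pd

resultstatsDF = pd.DataFrame({"a": [1, 2, 3, 4, 5]})
resultstatsDF["file"] = "asdf"

# Assumption: pandas >= 1.0, which provides the "string" extension dtype.
resultstatsDF["file"] = resultstatsDF["file"].astype("string")

# The column now reports a pandas StringDtype instead of the generic object dtype.
uses_string_dtype = isinstance(resultstatsDF["file"].dtype, pd.StringDtype)

# Individual values are still ordinary Python str objects.
first_value_type = type(resultstatsDF["file"].iat[0])

print(resultstatsDF.dtypes)
print(uses_string_dtype, first_value_type)
```

Only the dtype label of the column changes here; each value is a plain str both before and after the conversion.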

Sample:

df = pd.DataFrame({'strings':['a','d','f'],
                   'dicts':[{'a':4}, {'c':8}, {'e':9}],
                   'lists':[[4,8],[7,8],[3]],
                   'tuples':[(4,8),(7,8),(3,)],
                   'sets':[set([1,8]), set([7,3]), set([0,1])] })

print (df)
      dicts   lists    sets strings  tuples
0  {'a': 4}  [4, 8]  {8, 1}       a  (4, 8)
1  {'c': 8}  [7, 8]  {3, 7}       d  (7, 8)
2  {'e': 9}     [3]  {0, 1}       f    (3,)

All columns have the same dtype:

print (df.dtypes)
dicts      object
lists      object
sets       object
strings    object
tuples     object
dtype: object

But the types of the values differ. To check them, loop over the columns:

for col in df:
    print (df[col].apply(type))

0    <class 'dict'>
1    <class 'dict'>
2    <class 'dict'>
Name: dicts, dtype: object
0    <class 'list'>
1    <class 'list'>
2    <class 'list'>
Name: lists, dtype: object
0    <class 'set'>
1    <class 'set'>
2    <class 'set'>
Name: sets, dtype: object
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: strings, dtype: object
0    <class 'tuple'>
1    <class 'tuple'>
2    <class 'tuple'>
Name: tuples, dtype: object
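
The per-column check can also be condensed into one summary. This is a small sketch; element_types is a hypothetical helper name, not a pandas function, and it simply collects the Python type names found in each column:

```python
import pandas as pd


def element_types(frame):
    """Map each column name to the set of Python type names of its values."""
    return {col: {type(value).__name__ for value in frame[col]} for col in frame}


df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'lists': [[4, 8], [7, 8], [3]],
                   'tuples': [(4, 8), (7, 8), (3,)],
                   'sets': [{1, 8}, {7, 3}, {0, 1}]})

summary = element_types(df)
for col, names in summary.items():
    print(col, sorted(names))
```

A column whose set contains more than one name holds mixed types.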

Or check only the first value of each column:

print (type(df['strings'].iat[0]))
<class 'str'>

print (type(df['dicts'].iat[0]))
<class 'dict'>

print (type(df['lists'].iat[0]))
<class 'list'>

print (type(df['tuples'].iat[0]))
<class 'tuple'>

print (type(df['sets'].iat[0]))
<class 'set'>

If a column may contain mixed types (which can break some pandas functions), you can filter the rows by type with boolean indexing:

df = pd.DataFrame({'mixed':['3', 5, 9,'2']})
print (df)
  mixed
0     3
1     5
2     9
3     2

print (df.dtypes)
mixed    object
dtype: object


for col in df:
    print (df[col].apply(type))
0    <class 'str'>
1    <class 'int'>
2    <class 'int'>
3    <class 'str'>
Name: mixed, dtype: object

# Python 3: check against str
# Python 2: check against basestring
mask = df['mixed'].apply(lambda x: isinstance(x,str))
print (mask)
0     True
1    False
2    False
3     True
Name: mixed, dtype: bool

df = df[mask]
print (df)
  mixed
0     3
3     2
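
Filtering drops the rows that are not strings. If every row should be kept, the mixed column can be normalized to one type instead. This is a sketch of two possible options, assuming every value is either an int or a string of digits as in the sample; as_text and as_numbers are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'mixed': ['3', 5, 9, '2']})

# Option 1: turn every value into a Python str.
as_text = df['mixed'].astype(str)

# Option 2: parse every value as a number; this raises an error
# if a value cannot be parsed.
as_numbers = pd.to_numeric(df['mixed'])

print(as_text.apply(type))
print(as_numbers)
```

Which option fits depends on whether the column is meant to hold labels or quantities.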
