Pandas: Cast column to string does not work
I have a dataframe resultstatsDF:
import pandas as pd
resultstatsDF = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
resultstatsDF['file'] = 'asdf'
resultstatsDF.dtypes
a        int64
file    object
dtype: object
The column file has dtype object, and I would like to cast it to string.
I tried each of the following:
resultstatsDF = resultstatsDF.astype({'file': str})
resultstatsDF['file'] = resultstatsDF['file'].astype(str)
resultstatsDF['file'] = resultstatsDF['file'].to_string
resultstatsDF['file'] = resultstatsDF.file.apply(str)
resultstatsDF['file'] = resultstatsDF['file'].apply(str)
but whatever I do, when I check with
resultstatsDF.dtypes
the column file still has dtype object.
The dtype of a column holding strings, dicts, or lists is always object. To check the actual type of the values, select one value from the column, e.g. with iat:
type(resultstatsDF['file'].iat[0])
Sample:
resultstatsDF = pd.DataFrame({'file': ['a', 'd', 'f']})
print(resultstatsDF)
  file
0    a
1    d
2    f
print(type(resultstatsDF['file'].iloc[0]))
<class 'str'>
print(resultstatsDF['file'].apply(type))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: file, dtype: object
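If the goal is a column whose dtype itself says "string" rather than object, newer pandas versions offer an opt-in string extension dtype. The following is only a sketch, assuming pandas 1.0 or later is installed; the variable name as_string is made up for illustration and is not part of the question.

```python
import pandas as pd

resultstatsDF = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
resultstatsDF['file'] = 'asdf'

# Assumption: pandas >= 1.0, where the opt-in "string" extension dtype exists.
# 'as_string' is a hypothetical name used only for this illustration.
as_string = resultstatsDF['file'].astype('string')

print(as_string.dtype)          # the column-level dtype
print(type(as_string.iat[0]))   # the Python type of a single value
```

The individual values are Python str objects either way; only the dtype reported for the column changes.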
Sample:
df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'lists': [[4, 8], [7, 8], [3]],
                   'tuples': [(4, 8), (7, 8), (3,)],
                   'sets': [set([1, 8]), set([7, 3]), set([0, 1])]})
print(df)
      dicts   lists    sets  strings  tuples
0  {'a': 4}  [4, 8]  {8, 1}        a  (4, 8)
1  {'c': 8}  [7, 8]  {3, 7}        d  (7, 8)
2  {'e': 9}     [3]  {0, 1}        f    (3,)
All columns have the same dtype:
print(df.dtypes)
dicts      object
lists      object
sets       object
strings    object
tuples     object
dtype: object
But the type of the values differs. To check it for every column, use a loop:
for col in df:
    print(df[col].apply(type))
0    <class 'dict'>
1    <class 'dict'>
2    <class 'dict'>
Name: dicts, dtype: object
0    <class 'list'>
1    <class 'list'>
2    <class 'list'>
Name: lists, dtype: object
0    <class 'set'>
1    <class 'set'>
2    <class 'set'>
Name: sets, dtype: object
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: strings, dtype: object
0    <class 'tuple'>
1    <class 'tuple'>
2    <class 'tuple'>
Name: tuples, dtype: object
Or check only the first value of each column:
print (type(df['strings'].iat[0]))
<class 'str'>
print (type(df['dicts'].iat[0]))
<class 'dict'>
print (type(df['lists'].iat[0]))
<class 'list'>
print (type(df['tuples'].iat[0]))
<class 'tuple'>
print (type(df['sets'].iat[0]))
<class 'set'>
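For a wider dataframe, printing every value's type can get noisy. One possible shortcut, shown here only as a sketch, is to collect the distinct type names per column into a dict. The name types_per_column and the small example frame are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'strings': ['a', 'd', 'f'],
    'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
    'mixed': ['3', 5, 9],
})

# For each column, gather the names of the Python types that occur in it.
# 'types_per_column' is a hypothetical helper name, not a pandas API.
types_per_column = {
    col: sorted({type(value).__name__ for value in df[col]})
    for col in df
}
print(types_per_column)
```

A column with more than one entry in its list holds mixed types.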
If a column may contain mixed types (which can break some pandas functions), you can filter it by type with boolean indexing:
df = pd.DataFrame({'mixed': ['3', 5, 9, '2']})
print(df)
  mixed
0     3
1     5
2     9
3     2
print(df.dtypes)
mixed    object
dtype: object
for col in df:
    print(df[col].apply(type))
0    <class 'str'>
1    <class 'int'>
2    <class 'int'>
3    <class 'str'>
Name: mixed, dtype: object
# Python 3: test against str; Python 2: test against basestring
mask = df['mixed'].apply(lambda x: isinstance(x, str))
print(mask)
0     True
1    False
2    False
3     True
Name: mixed, dtype: bool
df = df[mask]
print(df)
  mixed
0     3
3     2
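Filtering drops the non-string rows. If you would rather keep every row and turn the values into strings, astype(str) is one option. This is a sketch of that alternative, not part of the filtering approach above; the variable name converted is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'mixed': ['3', 5, 9, '2']})

# astype(str) applies str() to each element, so the ints 5 and 9
# become the strings '5' and '9' and no rows are dropped.
converted = df['mixed'].astype(str)

print(converted.apply(type).unique())
print(converted.tolist())
```

After the conversion every value is a Python str, even where the column dtype is still reported as object.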