Convert strings to float in all pandas columns, where this is possible
Question
I created a pandas DataFrame from a list of lists:
import numpy as np
import pandas as pd
df_list = [["a", "1", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns = list("ABC"))
>>> df
   A  B    C
0  a  1    2
1  b  3  NaN
Is there a way to convert every column of the DataFrame that can be converted to float, i.e. B and C? The following works if you know which columns to convert:
df[["B", "C"]] = df[["B", "C"]].astype("float")
But what do you do if you don't know in advance which columns contain the numbers? When I tried
df = df.astype("float", errors = "ignore")
all columns were still strings/objects. Similarly,
df[["B", "C"]] = df[["B", "C"]].apply(pd.to_numeric)
converts both columns (though "B" becomes int and "C" becomes float, because of the NaN value in "C"), but
df = df.apply(pd.to_numeric)
obviously throws an error message, and I don't see a way to suppress it.
Is it possible to perform this string-to-float conversion without looping through each column and trying .astype("float", errors = "ignore")?
Recommended answer
I think you need the parameter errors='ignore' in to_numeric:
df = df.apply(pd.to_numeric, errors='ignore')
print (df.dtypes)
A object
B int64
C float64
dtype: object
This works well as long as a column does not mix numeric and non-numeric strings. A mixed column is left unchanged as object:
df_list = [["a", "t", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns = list("ABC"))
df = df.apply(pd.to_numeric, errors='ignore')
print (df)
   A  B    C
0  a  t  2.0    <= "t" added to column B to create mixed values
1  b  3  NaN
print (df.dtypes)
A object
B object
C float64
dtype: object
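You can also pass downcast='float' so that int columns become floats as well. Note that the result is float32, not float64 (shown here on the original data from the question):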
df = df.apply(pd.to_numeric, errors='ignore', downcast='float')
print (df.dtypes)
A object
B float32
C float32
dtype: object
This is the same as:
df = df.apply(lambda x: pd.to_numeric(x, errors='ignore', downcast='float'))
print (df.dtypes)
A object
B float32
C float32
dtype: object
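As far as I know, errors='ignore' is deprecated in to_numeric in newer pandas releases (2.2 onward), where the advice is to catch the exception yourself. A minimal sketch of that approach follows; the helper name to_float_if_possible is made up for this example and is not part of pandas. It also casts to float64 explicitly, which avoids the float32 result of downcast='float'.

```python
import numpy as np
import pandas as pd


def to_float_if_possible(column):
    """Return the column as float64 if every value parses, else unchanged.

    Hypothetical helper for illustration; not a pandas function.
    """
    try:
        # Without an errors argument, to_numeric raises on unparseable values.
        return pd.to_numeric(column).astype("float64")
    except (ValueError, TypeError):
        return column


df_list = [["a", "1", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns=list("ABC"))

# apply still visits each column, but without an explicit loop in user code.
df = df.apply(to_float_if_possible)
print(df.dtypes)
```

As with errors='ignore', a column that mixes numbers and non-numeric strings is returned unchanged.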