大 pandas :to_numeric用于多列 [英] pandas: to_numeric for multiple columns
本文介绍了大 pandas :to_numeric用于多列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我正在使用以下 df :
c.sort_values('2005', ascending=False).head(3)
GeoName ComponentName IndustryId IndustryClassification Description 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014
37926 Alabama Real GDP by state 9 213 Support activities for mining 99 98 117 117 115 87 96 95 103 102 (NA)
37951 Alabama Real GDP by state 34 42 Wholesale trade 9898 10613 10952 11034 11075 9722 9765 9703 9600 9884 10199
37932 Alabama Real GDP by state 15 327 Nonmetallic mineral products manufacturing 980 968 940 1084 861 724 714 701 589 641 (NA)
我想在所有年份中强制使用数字:
I want to force numeric on all of the years:
c['2014'] = pd.to_numeric(c['2014'], errors='coerce')
是否有一种简便的方法来执行此操作,或者我必须将它们全部打出来?
is there an easy way to do this or do I have to type them all out?
推荐答案
更新:之后,您无需转换值,就可以即时读取CSV时:
UPDATE: you don't need to convert your values afterwards, you can do it on-the-fly when reading your CSV:
In [165]: df=pd.read_csv(url, index_col=0, na_values=['(NA)']).fillna(0)
In [166]: df.dtypes
Out[166]:
GeoName object
ComponentName object
IndustryId int64
IndustryClassification object
Description object
2004 int64
2005 int64
2006 int64
2007 int64
2008 int64
2009 int64
2010 int64
2011 int64
2012 int64
2013 int64
2014 float64
dtype: object
如果您需要将多列转换为数字dtype,请使用以下技术:
If you need to convert multiple columns to numeric dtypes - use the following technique:
样本源DF:
In [271]: df
Out[271]:
id a b c d e f
0 id_3 AAA 6 3 5 8 1
1 id_9 3 7 5 7 3 BBB
2 id_7 4 2 3 5 4 2
3 id_0 7 3 5 7 9 4
4 id_0 2 4 6 4 0 2
In [272]: df.dtypes
Out[272]:
id object
a object
b int64
c int64
d int64
e int64
f object
dtype: object
将选定的列转换为数字dtype:
Converting selected columns to numeric dtypes:
In [273]: cols = df.columns.drop('id')
In [274]: df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')
In [275]: df
Out[275]:
id a b c d e f
0 id_3 NaN 6 3 5 8 1.0
1 id_9 3.0 7 5 7 3 NaN
2 id_7 4.0 2 3 5 4 2.0
3 id_0 7.0 3 5 7 9 4.0
4 id_0 2.0 4 6 4 0 2.0
In [276]: df.dtypes
Out[276]:
id object
a float64
b int64
c int64
d int64
e int64
f float64
dtype: object
如果要选择全部 string
(object
)列,请使用以下简单技巧:
PS if you want to select all string
(object
) columns use the following simple trick:
cols = df.columns[df.dtypes.eq('object')]
这篇关于大 pandas :to_numeric用于多列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文