Convert a column in a Python pandas DataFrame from STRING MONTH into INT
Question
In Python 2.7.11 and pandas 0.18.1:
If we have the following CSV file:
YEAR,MONTH,ID
2011,JAN,1
2011,FEB,1
2011,MAR,1
Is there any way to read it as a pandas DataFrame and convert the MONTH column into integers like this?
YEAR,MONTH,ID
2011,1,1
2011,2,1
2011,3,1
Some pandas functions such as "dt.strftime('%b')" don't seem to work. Could someone enlighten me?
Recommended answer
I guess the easiest and one of the fastest methods would be to create a mapping dict and use map, as follows:
In [2]: df
Out[2]:
YEAR MONTH ID
0 2011 JAN 1
1 2011 FEB 1
2 2011 MAR 1
In [3]: d = {'JAN':1, 'FEB':2, 'MAR':3, 'APR':4, }
In [4]: df.MONTH = df.MONTH.map(d)
In [5]: df
Out[5]:
YEAR MONTH ID
0 2011 1 1
1 2011 2 1
2 2011 3 1
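For reference, here is a minimal end-to-end sketch of the mapping approach above. It assumes the CSV content from the question is available as text (the `csv_text` variable and the `month_to_int` name are illustrative, not part of the original answer) and spells out all twelve abbreviations:

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the CSV file shown in the question.
csv_text = "YEAR,MONTH,ID\n2011,JAN,1\n2011,FEB,1\n2011,MAR,1\n"
df = pd.read_csv(io.StringIO(csv_text))

# Upper-case month abbreviations mapped to month numbers.
month_to_int = {
    'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
    'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12,
}

# Series.map looks each value up in the dict; values missing from it become NaN.
df['MONTH'] = df['MONTH'].map(month_to_int)
print(df)
```

With a real file, `pd.read_csv` would take the file path instead of the `StringIO` object.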
You may want to use df.MONTH = df.MONTH.str.upper().map(d) if not all MONTH values are in upper case.
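A small sketch of that normalization step, under the assumption that the data may also contain stray whitespace or unknown labels (the sample values below are made up for illustration). Values that are not in the dict come back as NaN, which makes them easy to find:

```python
import pandas as pd

# Made-up sample with mixed case, padding, and one invalid label.
months = pd.Series(['Jan', 'feb', ' MAR ', 'xyz'])
d = {'JAN': 1, 'FEB': 2, 'MAR': 3}

# Strip whitespace and upper-case before the lookup.
normalized = months.str.strip().str.upper()
result = normalized.map(d)

# Rows that could not be mapped are NaN in the result.
unmapped = months[result.isna()]
print(result)
print(unmapped)
```

Because of the NaN, the result column is floating point here; it stays integer only when every value is found in the dict.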
Another slower but more robust method:
In [11]: pd.to_datetime(df.MONTH, format='%b').dt.month
Out[11]:
0 1
1 2
2 3
Name: MONTH, dtype: int64
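The same idea as a self-contained sketch, assigning the result back to the column. The second part, using `errors='coerce'`, is an addition not present in the original answer: it turns unparseable labels into NaT instead of raising, which is one way to handle dirty data. Month-name matching depends on the active locale, so this assumes English abbreviations.

```python
import pandas as pd

df = pd.DataFrame({
    'YEAR': [2011, 2011, 2011],
    'MONTH': ['JAN', 'FEB', 'MAR'],
    'ID': [1, 1, 1],
})

# Parse the abbreviation as a date (the year defaults to 1900), then keep the month number.
df['MONTH'] = pd.to_datetime(df['MONTH'], format='%b').dt.month

# Optional: with errors='coerce', an invalid label becomes NaT rather than an exception.
dirty = pd.Series(['JAN', 'xyz'])
dirty_months = pd.to_datetime(dirty, format='%b', errors='coerce').dt.month
print(df)
print(dirty_months)
```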
UPDATE: we can create a mapping automatically (thanks to @Quetzalcoatl)
import calendar
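# calendar.month_abbr[0] is an empty string, so skip it; upper-case the keys to match 'JAN', 'FEB', ...
d = dict((v.upper(), k) for k, v in enumerate(calendar.month_abbr) if v)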
or (using pandas only):
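# names first, numbers second, so the dict maps 'JAN' -> 1 rather than 1 -> 'Jan'
d = dict(zip([m.upper() for m in pd.date_range('2000-01-01', freq='M', periods=12).strftime('%b')], range(1, 13)))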
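As a quick sanity check, the sketch below builds the name-to-number mapping both ways and confirms they agree before applying it. The variable names are illustrative, the keys are upper-cased to match the data in the question, and an English locale is assumed (both `calendar.month_abbr` and `strftime('%b')` are locale-dependent). It uses `freq='MS'` (month start) to get one timestamp per month.

```python
import calendar

import pandas as pd

# From the standard library: index 0 of month_abbr is '', so it is filtered out.
from_calendar = {abbr.upper(): num
                 for num, abbr in enumerate(calendar.month_abbr) if abbr}

# From pandas only: twelve month-start dates formatted as abbreviations.
names = pd.date_range('2000-01-01', freq='MS', periods=12).strftime('%b')
from_pandas = {name.upper(): num for num, name in enumerate(names, start=1)}

# Both constructions should yield the same 12-entry mapping.
same_mapping = from_calendar == from_pandas

months = pd.Series(['JAN', 'FEB', 'MAR'])
converted = months.map(from_calendar)
print(same_mapping)
print(converted)
```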