Get MM-DD-YYYY from pandas Timestamp


Problem description


Dates seem to be a tricky thing in Python, and I am having a lot of trouble simply stripping the date out of a pandas Timestamp. I would like to get from 2013-09-29 02:34:44 to simply 09-29-2013.

I have a DataFrame with a column Created_Date:

Name: Created_Date, Length: 1162549, dtype: datetime64[ns]

I have tried applying the .date() method to this Series, e.g. df.Created_Date.date(), but I get the error AttributeError: 'Series' object has no attribute 'date'

Can someone help me out?

Solution

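Map over the elements (note that the '%d-%m-%Y' format string used here produces DD-MM-YYYY; use '%m-%d-%Y' for the MM-DD-YYYY layout asked for in the question):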

In [239]: from operator import methodcaller

In [240]: s = Series(date_range(Timestamp('now'), periods=2))

In [241]: s
Out[241]:
0   2013-10-01 00:24:16
1   2013-10-02 00:24:16
dtype: datetime64[ns]

In [238]: s.map(lambda x: x.strftime('%d-%m-%Y'))
Out[238]:
0    01-10-2013
1    02-10-2013
dtype: object

In [242]: s.map(methodcaller('strftime', '%d-%m-%Y'))
Out[242]:
0    01-10-2013
1    02-10-2013
dtype: object
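As a hedged aside: more recent pandas versions expose a vectorised .dt accessor on datetime Series, which avoids mapping element by element. The sketch below assumes such a version and uses a hypothetical DataFrame df with a Created_Date column mirroring the question. It uses the '%m-%d-%Y' format string, which yields the MM-DD-YYYY layout the question asks for.

```python
import pandas as pd

# Hypothetical data mirroring the question's Created_Date column.
df = pd.DataFrame(
    {"Created_Date": pd.to_datetime(["2013-09-29 02:34:44", "2013-10-01 00:24:16"])}
)

# .dt.strftime formats every element at once; the result holds strings.
formatted = df["Created_Date"].dt.strftime("%m-%d-%Y")
print(formatted.tolist())  # ['09-29-2013', '10-01-2013']
```

The result contains strings rather than datetimes, so it is best produced at the very end, for display or export.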

You can get the raw datetime.date objects by calling the date() method of the Timestamp elements that make up the Series:

In [249]: s.map(methodcaller('date'))

Out[249]:
0    2013-10-01
1    2013-10-02
dtype: object

In [250]: s.map(methodcaller('date')).values

Out[250]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)
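The same datetime.date objects can probably be obtained without an explicit map. The following is a minimal sketch, assuming a pandas version that provides the .dt accessor; the two timestamps are simply the ones from the transcript above.

```python
import datetime

import pandas as pd

s = pd.Series(pd.to_datetime(["2013-10-01 00:24:16", "2013-10-02 00:24:16"]))

# .dt.date returns a Series whose elements are plain datetime.date objects.
dates = s.dt.date
print(dates.tolist())

# The element-wise map from the answer gives the same values.
mapped = s.map(lambda ts: ts.date())
print(mapped.tolist() == dates.tolist())  # True
```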

Yet another way you can do this is by calling the unbound Timestamp.date method:

In [273]: s.map(Timestamp.date)
Out[273]:
0    2013-10-01
1    2013-10-02
dtype: object

This method is the fastest, and IMHO the most readable. Timestamp is accessible in the top-level pandas module, like so: pandas.Timestamp. I've imported it directly for expository purposes.

The date attribute of DatetimeIndex objects does something similar, but returns a numpy object array instead:

In [243]: index = DatetimeIndex(s)

In [244]: index
Out[244]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-10-01 00:24:16, 2013-10-02 00:24:16]
Length: 2, Freq: None, Timezone: None

In [246]: index.date
Out[246]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)
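To make the return types concrete, here is a small sketch of the DatetimeIndex route. It builds the index directly from strings for brevity (an assumption; the transcript above builds it from the Series s) and also shows that strftime is available on the index itself.

```python
import datetime

import numpy as np
import pandas as pd

index = pd.DatetimeIndex(["2013-10-01 00:24:16", "2013-10-02 00:24:16"])

# .date yields a NumPy object array of datetime.date values, not a Series.
raw_dates = index.date
print(type(raw_dates).__name__, raw_dates.dtype)

# strftime on the index formats all entries at once and returns an Index of strings.
labels = index.strftime("%m-%d-%Y")
print(labels.tolist())  # ['10-01-2013', '10-02-2013']
```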

For larger datetime64[ns] Series objects, mapping Timestamp.date is faster than operator.methodcaller, which in turn is slightly faster than a lambda:

In [263]: f = methodcaller('date')

In [264]: flam = lambda x: x.date()

In [265]: fmeth = Timestamp.date

In [266]: s2 = Series(date_range('20010101', periods=1000000, freq='T'))

In [267]: s2
Out[267]:
0    2001-01-01 00:00:00
1    2001-01-01 00:01:00
2    2001-01-01 00:02:00
3    2001-01-01 00:03:00
4    2001-01-01 00:04:00
5    2001-01-01 00:05:00
6    2001-01-01 00:06:00
7    2001-01-01 00:07:00
8    2001-01-01 00:08:00
9    2001-01-01 00:09:00
10   2001-01-01 00:10:00
11   2001-01-01 00:11:00
12   2001-01-01 00:12:00
13   2001-01-01 00:13:00
14   2001-01-01 00:14:00
...
999985   2002-11-26 10:25:00
999986   2002-11-26 10:26:00
999987   2002-11-26 10:27:00
999988   2002-11-26 10:28:00
999989   2002-11-26 10:29:00
999990   2002-11-26 10:30:00
999991   2002-11-26 10:31:00
999992   2002-11-26 10:32:00
999993   2002-11-26 10:33:00
999994   2002-11-26 10:34:00
999995   2002-11-26 10:35:00
999996   2002-11-26 10:36:00
999997   2002-11-26 10:37:00
999998   2002-11-26 10:38:00
999999   2002-11-26 10:39:00
Length: 1000000, dtype: datetime64[ns]

In [269]: timeit s2.map(f)
1 loops, best of 3: 1.04 s per loop

In [270]: timeit s2.map(flam)
1 loops, best of 3: 1.1 s per loop

In [271]: timeit s2.map(fmeth)
1 loops, best of 3: 968 ms per loop
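The comparison can be reproduced outside IPython with the standard-library timeit module. This is only a sketch: the Series is shrunk to 10,000 rows so it runs quickly, the 'min' frequency alias replaces 'T' (which newer pandas releases deprecate), and the .dt.date entry is an addition that is not part of the original benchmark. Absolute timings will vary with machine and pandas version, so no numbers are claimed here.

```python
import timeit
from operator import methodcaller

import pandas as pd

# Smaller than the one-million-row Series above so the sketch runs quickly.
s2 = pd.Series(pd.date_range("2001-01-01", periods=10_000, freq="min"))

candidates = {
    "methodcaller": lambda: s2.map(methodcaller("date")),
    "lambda": lambda: s2.map(lambda x: x.date()),
    "Timestamp.date": lambda: s2.map(pd.Timestamp.date),
    ".dt.date": lambda: s2.dt.date,  # vectorised accessor, added for comparison
}

for name, func in candidates.items():
    # Best of three single runs, in the spirit of IPython's timeit output.
    best = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{name:>15}: {best * 1e3:.2f} ms")
```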

Keep in mind that one of the goals of pandas is to provide a layer on top of numpy so that (most of the time) you don't have to deal with the low-level details of the ndarray. Getting the raw datetime.date objects in an array is therefore of limited use, since they don't correspond to any numpy.dtype that pandas supports (at the time of writing, pandas only supports datetime64[ns], i.e. nanosecond-resolution, dtypes). That said, sometimes you need to do this.
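If the goal is only to drop the time of day while staying inside pandas' datetime machinery, one option is to normalise the timestamps to midnight instead of converting to Python objects. The sketch below assumes a pandas version with the .dt accessor and reuses the two example timestamps from the answer.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2013-10-01 00:24:16", "2013-10-02 00:24:16"]))

# normalize() resets the time to midnight but keeps a datetime64 dtype,
# so vectorised datetime operations remain available.
midnight = s.dt.normalize()
print(midnight.dtype)

# Example: counting rows per day works without converting to Python objects.
print(midnight.value_counts().sort_index())
```

Formatting to a string such as MM-DD-YYYY can then be left to the final output step.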
