Which is the fastest way to extract day, month and year from a given date?

Views: 109 Published: 2017/4/6 20:39:25 Tags: python date datetime pandas


Problem description

 
I read a CSV file containing 150,000 lines into a pandas DataFrame. This DataFrame has a field, 'Date', with the dates in yyyy-mm-dd format. I want to extract the month, day and year from it and copy them into the DataFrame's columns 'Month', 'Day' and 'Year' respectively. For a few hundred records the two methods below work fine, but for 150,000 records both take a ridiculously long time to execute. Is there a faster way to do this for 100,000+ records?

First method: 
df = pandas.read_csv(filename)
for i in xrange(len(df)): 
   df.loc[i,'Day'] = int(df.loc[i,'Date'].split('-')[2])
Second method: 
df = pandas.read_csv(filename)
for i in xrange(len(df)):
   df.loc[i,'Day'] = datetime.strptime(df.loc[i,'Date'], '%Y-%m-%d').day
Thank you.
Solution
In pandas 0.15.0 you will be able to use the new .dt accessor, which gives a nicer syntax for this.
In [36]: df = DataFrame(date_range('20000101',periods=150000,freq='H'),columns=['Date'])

In [37]: df.head(5)
Out[37]: 
                 Date
0 2000-01-01 00:00:00
1 2000-01-01 01:00:00
2 2000-01-01 02:00:00
3 2000-01-01 03:00:00
4 2000-01-01 04:00:00

[5 rows x 1 columns]

In [38]: %timeit f(df)
10 loops, best of 3: 22 ms per loop

In [39]: def f(df):
    df = df.copy()
    df['Year'] = DatetimeIndex(df['Date']).year
    df['Month'] = DatetimeIndex(df['Date']).month
    df['Day'] = DatetimeIndex(df['Date']).day
    return df
   ....: 

In [40]: f(df).head()
Out[40]: 
                 Date  Year  Month  Day
0 2000-01-01 00:00:00  2000      1    1
1 2000-01-01 01:00:00  2000      1    1
2 2000-01-01 02:00:00  2000      1    1
3 2000-01-01 03:00:00  2000      1    1
4 2000-01-01 04:00:00  2000      1    1

[5 rows x 4 columns]
From 0.15.0 on (released at the end of September 2014), the following is possible with the new .dt accessor:
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
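
The .dt accessor only works on a datetime column, whereas a 'Date' column read from a CSV file holds plain strings by default. Here is a minimal sketch of the full path from CSV to the three columns, assuming the yyyy-mm-dd format from the question. The in-memory CSV text and the 'Value' column are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file from the question.
csv_text = "Date,Value\n2014-09-30,1\n2015-01-02,2\n2016-12-25,3\n"

# Read the file; at this point 'Date' holds plain strings.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the whole column to datetime64 in one vectorized call.
# Passing an explicit format avoids format inference.
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")

# The .dt accessor (pandas 0.15.0 and later) extracts each component
# for the entire column at once, with no Python-level loop.
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Day"] = df["Date"].dt.day

print(df)
```

Alternatively, pandas.read_csv(filename, parse_dates=['Date']) should parse the column while reading, which makes the separate to_datetime step unnecessary.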


                        