Python Pandas - Day and Month mix up
Problem description
I have a 'myfile.csv' file which has a 'timestamp' column that starts at
(01/05/2015 11:51:00)
and ends at
(07/05/2015 23:22:00)
A total span of 9,727 minutes.
'myfile.csv' also has a column named 'A' which holds some numerical value. There are multiple values of 'A' within each minute, each with a unique timestamp to the nearest second.
My code is as follows:
df = pd.read_csv('myfile.csv')
df = df.set_index('timestamp')
df.index = df.index.to_datetime()
df.sort_index(inplace=True)
df = df['A'].resample('1Min').mean()
df.index = (df.index.map(lambda t: t.strftime('%Y-%m-%d %H:%M')))
My problem is that Python seems to think 'timestamp' starts at
(01/05/2015 11:51:00) -> 5th of January
and ends at
(07/05/2015 23:22:00) -> 5th of July
But really 'timestamp' starts on the 1st of May and ends on the 7th of May.
So the above code produces a dataframe with 261,332 rows, OMG, when it should really only have 9,727 rows.
Somehow Python is mixing up the month with the day and misinterpreting the dates. How do I sort this out?
Recommended answer
There are many arguments to read_csv that can help you parse dates from a CSV straight into your pandas DataFrame. Here we can set parse_dates to the columns you want parsed as dates and then use dayfirst. dayfirst defaults to False, which is why an ambiguous date such as 01/05/2015 is read month-first. The following should do what you want, assuming the dates are in the first column.
df = pd.read_csv('myfile.csv', parse_dates=[0], dayfirst=True)
If the dates column is not the first column, just change the 0 to that column's zero-based position.
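To see the effect without the original file, here is a minimal sketch that feeds a small in-memory CSV to read_csv both ways. The three rows and the values of 'A' are made up for illustration, and the variable names (csv_text, month_first, day_first, per_minute, explicit) are hypothetical. It also shows an alternative the answer above does not mention: passing an explicit format to pd.to_datetime, which removes the ambiguity entirely and replaces the index.to_datetime() call from the question, a method that recent pandas versions no longer provide.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'myfile.csv'; the values of 'A' are made up.
csv_text = """timestamp,A
01/05/2015 11:51:00,1.0
01/05/2015 11:51:30,3.0
07/05/2015 23:22:00,5.0
"""

# Default behaviour: ambiguous dates are read month-first (January, July).
month_first = pd.read_csv(io.StringIO(csv_text), parse_dates=[0])
month_first = month_first.set_index('timestamp')

# dayfirst=True: the same strings are read as 1 May and 7 May.
day_first = pd.read_csv(io.StringIO(csv_text), parse_dates=[0], dayfirst=True)
day_first = day_first.set_index('timestamp')

# One row per minute, averaging the readings that fall inside each minute.
per_minute = day_first['A'].resample('1min').mean()

# Alternative: read the column as text and state the format explicitly.
explicit = pd.read_csv(io.StringIO(csv_text))
explicit['timestamp'] = pd.to_datetime(
    explicit['timestamp'], format='%d/%m/%Y %H:%M:%S'
)
explicit = explicit.set_index('timestamp').sort_index()

print(month_first.index[0], day_first.index[0], len(per_minute))
```

With the day-first index, the resampled series only spans the first week of May, so the row count stays in the thousands rather than covering every minute from January to July.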