Sorting date data in pandas Series
Problem description
The data looks like this:
0 Thursday
1 Thursday
2 Thursday
3 Thursday
etc, etc
My code:
import pandas as pd
data_file = pd.read_csv('./data/Chicago-2016-Summary.csv')
days = data_file['day_of_week']
order = ["Monday","Tuesday","Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
sorted(days, key=lambda x: order.index(x[0]))
print(days)
This results in the error:
ValueError: 'T' is not in list
I tried to sort and get this error but I have no idea what this means.
I just want to sort the data Monday-Sunday so I can do some visualizations. Any suggestions?
Recommended answer
You can use pandas' Categorical data type for this:
order = ["Monday","Tuesday","Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
data_file['day_of_week'] = pd.Categorical(data_file['day_of_week'], categories=order, ordered=True)
data_file.sort_values(by='day_of_week', inplace=True)
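As a minimal self-contained sketch of the approach above, assuming a small made-up DataFrame in place of Chicago-2016-Summary.csv (the trips column and all sample values are hypothetical, only day_of_week comes from the question):

```python
import pandas as pd

# Hypothetical stand-in for the CSV; only 'day_of_week' appears in the question.
data_file = pd.DataFrame({
    "day_of_week": ["Thursday", "Monday", "Sunday", "Tuesday", "Monday"],
    "trips": [5, 3, 8, 2, 7],
})

order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# An ordered categorical compares values by their position in `order`,
# not alphabetically.
data_file["day_of_week"] = pd.Categorical(
    data_file["day_of_week"], categories=order, ordered=True
)

# mergesort is stable, so rows sharing a day keep their original relative order.
sorted_file = data_file.sort_values(by="day_of_week", kind="mergesort").reset_index(drop=True)
print(sorted_file)
```

Note that any value not listed in order is converted to NaN by pd.Categorical, so the spellings in the column must match the list exactly.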
In your example, be aware that when you specify
days = data_file['day_of_week']
you are creating a view of that column (Series) within your data_file frame. You may want to use days = data_file['day_of_week'].copy() instead. Or just work within the DataFrame, as is done above.
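As for the ValueError in the question: x[0] is the first character of each day name ('T' for 'Thursday'), and that single letter is not an element of order. Separately, sorted returns a new list and leaves days unchanged, so print(days) would show the unsorted data even without the error. A sketch of a corrected version of the original approach, using made-up sample values (the Series-based variant assumes pandas 1.1 or later for the key argument):

```python
import pandas as pd

order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Made-up sample values standing in for data_file['day_of_week'].
days = pd.Series(["Thursday", "Monday", "Sunday", "Tuesday"])

# Look up the whole day name (x), not its first character (x[0]),
# and keep the list that sorted() returns.
sorted_days = sorted(days, key=order.index)
print(sorted_days)

# Alternative that keeps a Series: map each name to its rank and sort by rank.
rank = {day: i for i, day in enumerate(order)}
sorted_series = days.sort_values(key=lambda s: s.map(rank))
print(sorted_series)
```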