Python split a list of datetimes by year + month

Views: 1219 | Published: 2017/4/15 12:56:52 | Tags: python, datetime, numpy


Problem description

I have the following CSV file:

    # simulate a csv file
    from StringIO import StringIO
    data = StringIO("""
    2012-04-01,00:10, A, 10
    2012-04-01,00:20, B, 11
    2012-04-01,00:30, B, 12
    2012-04-02,00:10, A, 18
    2012-05-02,00:20, A, 14
    2012-05-02,00:30, B, 11
    2012-05-03,00:10, A, 10
    2012-06-03,00:20, B, 13
    2012-06-03,00:30, C, 12
    """.strip())

which I would like to group by year+month plus category (i.e. A, B, C).

I would like the final data to be grouped by month and then by category, as a view of the original data:

    2012-04, A
    >> array[0,] => 2012-04-01,00:10, A, 10
    >> array[3,] => 2012-04-02,00:10, A, 18
    2012-04, B
    >> array[1,] => 2012-04-01,00:20, B, 11
    >> array[2,] => 2012-04-01,00:30, B, 12
    2012-05, A
    >> array[4,] => 2012-05-02,00:20, A, 14
    ...

Then, for each group, I would like to iterate and plot it using the same function.
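The grouping described above can be sketched with the standard library alone. This is a minimal Python 3 illustration, not the asker's code: the names `raw`, `rows`, and `groups` are assumptions made for the example, and the group key is simply the first seven characters of the date string (`YYYY-MM`) paired with the category.

```python
import csv
from collections import defaultdict
from io import StringIO

# Same sample rows as in the question.
raw = """\
2012-04-01,00:10, A, 10
2012-04-01,00:20, B, 11
2012-04-01,00:30, B, 12
2012-04-02,00:10, A, 18
2012-05-02,00:20, A, 14
2012-05-02,00:30, B, 11
2012-05-03,00:10, A, 10
2012-06-03,00:20, B, 13
2012-06-03,00:30, C, 12
"""

# Parse the rows and strip the padding around each field.
rows = [[field.strip() for field in row] for row in csv.reader(StringIO(raw))]

# Map (year-month, category) -> list of row indices into `rows`.
groups = defaultdict(list)
for i, (date, _time, category, _value) in enumerate(rows):
    groups[(date[:7], category)].append(i)

# Walk the groups in sorted order; a plotting function could be called here.
for (month, category), indices in sorted(groups.items()):
    print(month, category)
    for i in indices:
        print("  rows[%d] => %s" % (i, ",".join(rows[i])))
```

Storing row indices rather than copies keeps each group tied to the original data, so the same indices can later be used to select rows from an array.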

I have seen a similar question on splitting dates by day, "Split list of datetimes into days", and I am able to do so in my case a). However, I am having trouble turning that into a year+month split in case b).

Here is the snippet I have so far, with the issue I am running into:

    #! /usr/bin/python

    import numpy as np
    import csv
    import os
    from datetime import datetime

    def strToDate(string):
        d = datetime.strptime(string, '%Y-%m-%d')
        return d

    def strToMonthDate(string):
        d = datetime.strptime(string, '%Y-%m-%d')
        d_by_month = datetime(d.year, d.month, 1)
        return d_by_month

    # simulate a csv file
    from StringIO import StringIO
    data = StringIO("""
    2012-04-01,00:10, A, 10
    2012-04-01,00:20, B, 11
    2012-04-01,00:30, B, 12
    2012-04-02,00:10, A, 18
    2012-05-02,00:20, A, 14
    2012-05-02,00:30, B, 11
    2012-05-03,00:10, A, 10
    2012-06-03,00:20, B, 13
    2012-06-03,00:30, C, 12
    """.strip())

    arr = np.genfromtxt(data, delimiter=',', dtype=object)

    # a) If we were to just group by dates
    # Get unique dates
    #keys = np.unique(arr[:,0])
    #keys1 = np.unique(arr[:,2])
    # Group by unique dates
    #for key in keys:
    #    print key
    #    for key1 in keys1:
    #        group = arr[ (arr[:,0]==key) & (arr[:,2]==key1) ]
    #        if group.size:
    #            print "\t" + key1
    #            print group
    #    print "\n"

    # b) But if we want to group by year+month in the dates
    dates_by_month = np.array(map(strToMonthDate, arr[:,0]))
    keys2 = np.unique(dates_by_month)
    print dates_by_month
    # >> [datetime.datetime(2012, 4, 1, 0, 0), datetime.datetime(2012, 4, 1, 0, 0), ...
    print "\n"
    print keys2
    # >> [2012-04-01 00:00:00 2012-05-01 00:00:00 2012-06-01 00:00:00]

    for key in keys2:
        print key
        print type(key)
        group = arr[dates_by_month==key]
        print group
        print "\n"
Question: I get the monthly keys, but for every group all I get is [2012-04-01 00:10 A 10]. Each key in keys2 is of type datetime.datetime. Any idea what could be wrong? Suggestions for alternative implementations are welcome. I would prefer not to use an itertools.groupby solution, as it returns an iterator rather than an array, which is less suitable for plotting.
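One possible alternative that stays within NumPy is to truncate the dates to month precision with the `datetime64[M]` dtype and build one boolean mask per (month, category) pair. The following is a Python 3 sketch under the assumption that the columns have already been split into separate arrays; the names `dates`, `cats`, `values`, and `groups` are hypothetical and do not come from the question.

```python
import numpy as np

# Columns of the sample data from the question, already split per field.
dates = np.array(
    ["2012-04-01", "2012-04-01", "2012-04-01", "2012-04-02", "2012-05-02",
     "2012-05-02", "2012-05-03", "2012-06-03", "2012-06-03"],
    dtype="datetime64[D]",
)
cats = np.array(["A", "B", "B", "A", "A", "B", "A", "B", "C"])
values = np.array([10, 11, 12, 18, 14, 11, 10, 13, 12])

# Truncate every date to its month so that equal months compare equal.
months = dates.astype("datetime64[M]")

# Build one array of values per (month, category) pair that actually occurs.
groups = {}
for month in np.unique(months):
    for cat in np.unique(cats):
        mask = (months == month) & (cats == cat)
        if mask.any():
            groups[(str(month), str(cat))] = values[mask]

for key in sorted(groups):
    print(key, groups[key])
```

Each group is a plain array, so it can be passed straight to a plotting function. Note that boolean-mask indexing returns a copy rather than a view; if references back into the original rows are needed, `np.flatnonzero(mask)` gives the matching row indices instead.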

Edit1: Problem solved. The issue was that dates_by_month, which I use for advanced (boolean) indexing in case b), must be an np.array rather than the list that map returns: dates_by_month = np.array(map(strToMonthDate, arr[:,0])). I have fixed it in the snippet above, and the example now works.

Solution

I found where the issue was in my original solution.

In case b),

    dates_by_month = map(strToMonthDate, arr[:,0])

returns a list instead of a numpy array. Comparing a list with == yields a single boolean rather than an element-wise mask, so the advanced indexing

    group = arr[dates_by_month==key]

does not select the matching rows. If instead I have

    dates_by_month = np.array(map(strToMonthDate, arr[:,0]))

then the grouping works as expected.
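The difference between the two comparisons can be seen in a small Python 3 sketch. The names `to_month`, `date_strings`, `as_list`, and `as_array` are made up for the illustration. One assumption worth flagging: in Python 3, `map` returns an iterator rather than a list, so the sketch wraps it in `list(...)` before handing it to `np.array`.

```python
from datetime import datetime

import numpy as np

date_strings = ["2012-04-01", "2012-04-02", "2012-05-02"]

def to_month(s):
    """Parse a YYYY-MM-DD string and snap it to the first day of its month."""
    d = datetime.strptime(s, "%Y-%m-%d")
    return datetime(d.year, d.month, 1)

key = datetime(2012, 4, 1)

as_list = list(map(to_month, date_strings))  # plain Python list
as_array = np.array(as_list)                 # object array of datetimes

# A list compared to a datetime is a single bool, not a mask.
list_result = (as_list == key)

# An array compared to a datetime is evaluated element by element.
array_result = (as_array == key)

print(list_result)
print(array_result)
```

Indexing with the single `False` from the list comparison cannot pick out matching rows. Older NumPy releases likely treated that boolean as the integer index 0, which would explain why every group in the question showed only the first row.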
