Summing the number of occurrences per day in pandas
Problem description
I have a data set like this in a pandas DataFrame:
                              score
timestamp
2013-06-29 00:52:28+00:00 -0.420070
2013-06-29 00:51:53+00:00 -0.445720
2013-06-28 16:40:43+00:00  0.508161
2013-06-28 15:10:30+00:00  0.921474
2013-06-28 15:10:17+00:00  0.876710
I need the number of measurements that occur on each day, so I am looking for something like this:
            count
timestamp
2013-06-29      2
2013-06-28      3
I don't care about the sentiment column; I just want the count of occurrences per day.
If your timestamp index is a DatetimeIndex:
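import io
import pandas as pd

content = '''\
timestamp  score
2013-06-29 00:52:28+00:00  -0.420070
2013-06-29 00:51:53+00:00  -0.445720
2013-06-28 16:40:43+00:00   0.508161
2013-06-28 15:10:30+00:00   0.921474
2013-06-28 15:10:17+00:00   0.876710
'''
# content is a str, so wrap it in StringIO (BytesIO would need bytes on Python 3).
# Columns are split on two or more spaces; a regex separator uses the python engine.
df = pd.read_table(io.StringIO(content), sep=r'\s{2,}', engine='python',
                   parse_dates=[0], index_col=[0])
print(df)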
so df looks like this:
                        score
timestamp
2013-06-29 00:52:28 -0.420070
2013-06-29 00:51:53 -0.445720
2013-06-28 16:40:43  0.508161
2013-06-28 15:10:30  0.921474
2013-06-28 15:10:17  0.876710
print(df.index)
# <class 'pandas.tseries.index.DatetimeIndex'>
You can use:
print(df.groupby(df.index.date).count())
which yields
            score
2013-06-28      3
2013-06-29      2
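If the goal is a column literally named count, ordered newest day first as in the question, one possible variation is sketched below. It is not part of the original answer: it swaps count() for size(), which counts rows per group regardless of missing values, and it builds the sample frame by hand from the question's values instead of parsing text. The names stamps and counts are illustrative only.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumption: the index is tz-aware UTC).
stamps = pd.to_datetime([
    "2013-06-29 00:52:28+00:00",
    "2013-06-29 00:51:53+00:00",
    "2013-06-28 16:40:43+00:00",
    "2013-06-28 15:10:30+00:00",
    "2013-06-28 15:10:17+00:00",
])
df = pd.DataFrame(
    {"score": [-0.420070, -0.445720, 0.508161, 0.921474, 0.876710]},
    index=stamps,
)
df.index.name = "timestamp"

# Group on the calendar date of each timestamp and count rows per group.
counts = df.groupby(df.index.date).size().rename("count")
counts.index.name = "timestamp"

# Newest day first, as a one-column DataFrame named "count".
counts = counts.sort_index(ascending=False).to_frame()
print(counts)
```

Because size() ignores column contents, the result does not depend on the score column at all, which matches the stated requirement of not caring about that column.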
Note the importance of the parse_dates parameter. Without it, the index would just be a pandas.core.index.Index object, in which case you could not use df.index.date.
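If the index turns out to be a plain Index of strings, a minimal sketch of one way to recover is to convert it with pd.to_datetime before grouping. This is an assumption about how such a frame might look, not something shown in the question; the frame below is constructed by hand to imitate data loaded without parse_dates, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical frame whose index holds plain strings, as if parse_dates had been omitted.
df = pd.DataFrame(
    {"score": [-0.420070, -0.445720, 0.508161, 0.921474, 0.876710]},
    index=[
        "2013-06-29 00:52:28+00:00",
        "2013-06-29 00:51:53+00:00",
        "2013-06-28 16:40:43+00:00",
        "2013-06-28 15:10:30+00:00",
        "2013-06-28 15:10:17+00:00",
    ],
)
was_datetime = isinstance(df.index, pd.DatetimeIndex)  # False: a string index has no .date

# Convert the string index to a timezone-aware DatetimeIndex.
df.index = pd.to_datetime(df.index, utc=True)
is_datetime = isinstance(df.index, pd.DatetimeIndex)   # True after conversion

# Now the per-day grouping from the answer works.
daily = df.groupby(df.index.date).count()
print(daily)
```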
So the answer depends on type(df.index), which you have not shown.