Pandas Dataframe Mask based on index
Question
I have the following dataframe:
import pandas as pd
index = pd.date_range('2013-1-1',periods=10,freq='15Min')
data = pd.DataFrame(data=[1,2,3,4,5,6,7,8,9,0], columns=['value'], index=index)
How can I generate a mask based on the index value? I know I can do something like:
data['value'] > 3
Out[40]:
2013-01-01 00:00:00 False
2013-01-01 00:15:00 False
2013-01-01 00:30:00 False
2013-01-01 00:45:00 True
2013-01-01 01:00:00 True
2013-01-01 01:15:00 True
2013-01-01 01:30:00 True
2013-01-01 01:45:00 True
2013-01-01 02:00:00 True
2013-01-01 02:15:00 False
Freq: 15T, Name: value, dtype: bool
I want to generate a mask that only considers rows whose index falls in a certain range. I was thinking of doing something like data['index'].time() > datetime.time(1,15) to generate the mask, except that data['index'] of course fails because index is not the name of a column. How can you reference the index value of a row in a mask?
Recommended answer
You can use indexer_between_time to select the rows:
In [11]: data.index.indexer_between_time('01:15', '02:00')
Out[11]: array([5, 6, 7, 8])
In [12]: data.iloc[data.index.indexer_between_time('01:15', '02:00')]
Out[12]:
value
2013-01-01 01:15:00 6
2013-01-01 01:30:00 7
2013-01-01 01:45:00 8
2013-01-01 02:00:00 9
As you can see, you access the index through the .index attribute.
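The comparison from the question can also be written directly against the index. The following is a minimal sketch, not part of the original answer: it assumes that DatetimeIndex.time yields one datetime.time object per row, so comparing it with a datetime.time value gives a boolean array that can be used like any other mask. The names mask and selected are only illustrative.

```python
import datetime

import pandas as pd

index = pd.date_range('2013-01-01', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# data.index.time holds one datetime.time object per row of the index.
# Comparing it with a single datetime.time gives a boolean array (the mask).
mask = data.index.time > datetime.time(1, 15)

# The mask selects rows exactly like data['value'] > 3 would.
selected = data[mask]
print(selected)
```

Unlike indexer_between_time, which returns integer positions for use with iloc, this produces a true boolean mask, so it can be combined with other conditions using & and |.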
Note: by default both include_start and include_end are True for indexer_between_time, so both endpoints are included. It also offers a tz argument to compare the time against a different timezone.
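To illustrate the endpoint flags, here is a small sketch that excludes the end time and then turns the returned positions into a boolean mask. It assumes a pandas version in which indexer_between_time accepts include_start and include_end; the NumPy conversion step and the names positions and mask are illustrative additions, not something the answer itself shows.

```python
import numpy as np
import pandas as pd

index = pd.date_range('2013-01-01', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# Passing include_end=False drops the row at exactly 02:00.
positions = data.index.indexer_between_time('01:15', '02:00', include_end=False)

# indexer_between_time returns integer positions, not a boolean mask.
# Build a mask of the same length as the frame and switch those positions on.
mask = np.zeros(len(data), dtype=bool)
mask[positions] = True

print(data[mask])
```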