Pandas filtering - between_time on a non-index column

Views: 57 Published: 2020/5/24 2:33:09 Tags: python pandas


Problem description

I need to filter data down to specific hours of the day. The DataFrame method between_time seems to be the proper way to do that, but it only works on the index of the DataFrame, and I need to keep the data in its original format (for example, pivot tables expect the datetime column to exist under its proper name, not as the index).

This means that each filter looks something like this:

df.set_index(keys='my_datetime_field').between_time('8:00','21:00').reset_index()

This implies two reindexing operations every time such a filter is run.
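For concreteness, the round trip described above can be sketched as a small runnable example. The sample data (48 hourly rows and a `value` column) is made up for illustration; only the column name `my_datetime_field` and the time bounds come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two days of hourly timestamps in a regular column.
df = pd.DataFrame({
    "my_datetime_field": pd.date_range("2000-01-01", periods=48, freq="h"),
    "value": np.arange(48),
})

# Round trip: move the column into the index, filter by time of day,
# then move it back out into a regular column.
filtered = (
    df.set_index("my_datetime_field")
      .between_time("8:00", "21:00")
      .reset_index()
)

print(filtered["my_datetime_field"].dt.hour.min())
print(filtered["my_datetime_field"].dt.hour.max())
print(list(filtered.columns))
```

Note that set_index discards the original row labels, so after reset_index the result carries a fresh 0-based index rather than the labels of the rows that were kept.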

Is this good practice, or is there a more appropriate way to do the same thing?

Recommended answer

Create a DatetimeIndex, but store it in a variable rather than in the DataFrame. Then call its indexer_between_time method. This returns an integer array, which can then be used to select rows from df using iloc:

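import numpy as np
import pandas as pd

N = 100
# Sample data: N hourly timestamps in an ordinary column, not the index.
# 'h' is the hourly alias; the uppercase 'H' is deprecated in recent pandas.
df = pd.DataFrame(
    {'date': pd.date_range('2000-1-1', periods=N, freq='h'),
     'value': np.random.random(N)})

# Build a standalone DatetimeIndex from the column; df itself is left unchanged.
index = pd.DatetimeIndex(df['date'])
# indexer_between_time returns integer positions, so select rows with iloc.
df.iloc[index.indexer_between_time('8:00', '21:00')]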
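As a cross-check, the same rows can also be selected with a boolean mask built from the column's time of day. This is an alternative sketch, not part of the original answer; it assumes the `date` column has a datetime dtype so that the `.dt` accessor is available, and it uses both-ends-inclusive bounds to match the default behavior of indexer_between_time.

```python
from datetime import time

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2000-01-01", periods=100, freq="h"),
    "value": np.random.random(100),
})

# Approach from the answer: integer positions from a standalone DatetimeIndex.
positions = pd.DatetimeIndex(df["date"]).indexer_between_time("8:00", "21:00")
by_indexer = df.iloc[positions]

# Alternative: compare each value's time of day against the bounds.
times = df["date"].dt.time
mask = (times >= time(8, 0)) & (times <= time(21, 0))
by_mask = df[mask]

# Both selections keep the original row labels and column layout.
print(by_indexer.equals(by_mask))
print(len(by_indexer))
```

Neither variant touches the index of df, so the surviving rows keep their original labels. The mask compares Python time objects element by element, so on large frames the indexer approach is likely the faster of the two.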
