pandas.DatetimeIndex frequency is None and can't be set


Question

I created a DatetimeIndex from a "date" column:

sales.index = pd.DatetimeIndex(sales["date"])

Now the index looks like this:

DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-04', '2003-01-06',
                   '2003-01-07', '2003-01-08', '2003-01-09', '2003-01-10',
                   '2003-01-11', '2003-01-13',
                   ...
                   '2016-07-22', '2016-07-23', '2016-07-24', '2016-07-25',
                   '2016-07-26', '2016-07-27', '2016-07-28', '2016-07-29',
                   '2016-07-30', '2016-07-31'],
                  dtype='datetime64[ns]', name='date', length=4393, freq=None)

As you can see, the freq attribute is None. I suspect that errors down the road are caused by the missing freq. However, if I try to set the frequency explicitly, I get an exception:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-148-30857144de81> in <module>()
      1 #### DEBUG
----> 2 sales_train = disentangle(df_train)
      3 sales_holdout = disentangle(df_holdout)
      4 result = sarima_fit_predict(sales_train.loc[5002, 9990]["amount_sold"], sales_holdout.loc[5002, 9990]["amount_sold"])

<ipython-input-147-08b4c4ecdea3> in disentangle(df_train)
      2     # transform sales table to disentangle sales time series
      3     sales = df_train[["date", "store_id", "article_id", "amount_sold"]]
----> 4     sales.index = pd.DatetimeIndex(sales["date"], freq="d")
      5     sales = sales.pivot_table(index=["store_id", "article_id", "date"])
      6     return sales

/usr/local/lib/python3.6/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

/usr/local/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
    399                                          'dates does not conform to passed '
    400                                          'frequency {1}'
--> 401                                          .format(inferred, freq.freqstr))
    402 
    403         if freq_infer:

ValueError: Inferred frequency None from passed dates does not conform to passed frequency D

So apparently a frequency has been inferred, but it is stored in neither the freq nor the inferred_freq attribute of the DatetimeIndex; both are None. Can someone clear up the confusion?

Recommended answer

You have a couple of options here:

  • pd.infer_freq
  • pd.tseries.frequencies.to_offset
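As a rough illustration of these two helpers (the sample dates and variable names below are made up for this sketch and are not the question's data), pd.infer_freq returns a frequency string when the dates are regularly spaced and None otherwise, while to_offset converts a frequency string into an offset object:

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Three consecutive calendar days: a daily frequency can be inferred.
regular = pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-04"])
inferred = pd.infer_freq(regular)

# Same pattern as the start of the question's index: 2003-01-05 is absent,
# so no single frequency fits and infer_freq returns None.
gappy = pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-04", "2003-01-06"])
inferred_gappy = pd.infer_freq(gappy)

# to_offset turns a frequency string into a DateOffset object.
offset = to_offset("D")

print(inferred, inferred_gappy, offset.freqstr)
```

This would explain the "Inferred frequency None" part of the error message: inference was attempted on dates with gaps and came back empty.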

I suspect that errors down the road are caused by the missing freq.

You are absolutely right. Here's what I often use:

import pandas as pd


def add_freq(idx, freq=None):
    """Add a frequency attribute to idx, through inference or directly.

    Returns a copy.  If `freq` is None, it is inferred.
    """

    idx = idx.copy()
    if freq is None:
        if idx.freq is None:
            freq = pd.infer_freq(idx)
        else:
            return idx
    idx.freq = pd.tseries.frequencies.to_offset(freq)
    if idx.freq is None:
        raise AttributeError('no discernible frequency found to `idx`.  Specify'
                             ' a frequency string with `freq`.')
    return idx

An example:

idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])  # freq=None

print(add_freq(idx))  # inferred
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype='datetime64[ns]', freq='B')

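print(add_freq(idx, freq='D'))  # explicit
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype='datetime64[ns]', freq='D')

Note that this output comes from the pandas version in use when the answer was written. More recent pandas releases validate the frequency when it is assigned, so the explicit 'D' case may raise a ValueError there, because these dates skip a weekend and do not conform to a daily frequency.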
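If inference fails or an explicit frequency is rejected, it can help to see which dates are absent from the index. The following is a minimal sketch, assuming a plain daily calendar is what you expect; the sample dates and the names full and missing are illustrative only:

```python
import pandas as pd

# A short index with the same kind of gap as in the question.
idx = pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-04", "2003-01-06"])

# Build the complete daily range between the first and last date,
# then keep only the dates that are not present in idx.
full = pd.date_range(idx.min(), idx.max(), freq="D")
missing = full.difference(idx)

print(missing)
```

Any dates listed in missing are the ones that prevent the index from conforming to a daily frequency.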

Using asfreq will actually reindex (fill) missing dates, so be careful if that's not what you're looking for.

The primary function for changing frequencies is the asfreq function. For a DatetimeIndex, this is basically just a thin but convenient wrapper around reindex, which generates a date_range and calls reindex.
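To make the reindexing behaviour concrete, here is a small sketch, assuming a Series with made-up values on the same three dates as the example above. It is meant only to show that asfreq adds rows for the missing dates rather than just attaching a frequency label:

```python
import pandas as pd

s = pd.Series(
    [10, 20, 30],
    index=pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-06"]),
)

# asfreq reindexes onto a full daily range; the two dates that were
# absent (2003-01-04 and 2003-01-05) are inserted with NaN values.
daily = s.asfreq("D")

# fill_value supplies a value for the newly inserted rows instead of NaN.
filled = s.asfreq("D", fill_value=0)

print(daily)
print(daily.index.freq)
```

Whether the extra rows are acceptable depends on the downstream model; if the gaps carry meaning (for example, days the store was closed), filling them may change the results.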

