Sum overflow TimeDeltas in Python Pandas


Question

When I try to sum timedeltas in pandas, it seems to work for a slice but not for the whole column.

>>> d.ix[0:100, 'VOID-DAYS'].sum()
Timedelta('2113 days 00:00:00')

>>> d['VOID-DAYS'].sum()
ValueError: overflow in timedelta operation

Recommended answer

If VOID-DAYS represents a whole number of days, convert the Timedeltas to integers:

df['VOID-DAYS'] = df['VOID-DAYS'].dt.days


import numpy as np
import pandas as pd
df = pd.DataFrame({'VOID-DAYS': pd.to_timedelta(np.ones((106752,)), unit='D')})
try:
    print(df['VOID-DAYS'].sum())
except ValueError as err:
    print(err)
    # overflow in timedelta operation


df['VOID-DAYS'] = df['VOID-DAYS'].dt.days
print(df['VOID-DAYS'].sum())
# 106752


If the Timedeltas include seconds or smaller units, then use

df['VOID-DAYS'] = df['VOID-DAYS'].dt.total_seconds()

to convert the values to floats.
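As a minimal sketch of the float-based approach, the snippet below builds a hypothetical column (the sample data, `n`, and `seconds_each` are assumptions made for illustration, not part of the original question) whose total duration does not fit in 64-bit nanoseconds, then sums it as seconds instead. Note that floats trade the overflow for limited precision, so nanosecond-level detail may be lost in very large totals.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 106752 durations of 1 day and 30 seconds each.
# Their total (about 9.2e18 ns) does not fit in a signed 64-bit
# nanosecond counter.
n = 106752
seconds_each = 24 * 3600 + 30
df = pd.DataFrame(
    {'VOID-DAYS': pd.to_timedelta(np.full(n, seconds_each), unit='s')}
)

# Convert each Timedelta to a float number of seconds, then sum the floats.
total_seconds = df['VOID-DAYS'].dt.total_seconds().sum()
print(total_seconds)
```

The result is a plain float of seconds, so it can be divided by 86400 to report days if that is the unit you need.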

Pandas Timedeltas (Series and TimedeltaIndexes) store all timedeltas as ints compatible with NumPy's timedelta64[ns] dtype. This dtype uses 8-byte ints to store the timedelta in nanoseconds.

The largest number of days representable in this format is

In [73]: int(float(np.iinfo(np.int64).max) / (10**9 * 3600 * 24))
Out[73]: 106751

This is why

In [74]: pd.Series(pd.to_timedelta(np.ones((106752,)), unit='D')).sum()
ValueError: overflow in timedelta operation

raises a ValueError, but

In [75]: pd.Series(pd.to_timedelta(np.ones((106751,)), unit='D')).sum()
Out[75]: Timedelta('106751 days 00:00:00')

does not.
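The boundary between 106751 and 106752 days can be checked with plain integer arithmetic. This is only an illustrative sketch; the names `INT64_MAX` and `NS_PER_DAY` are introduced here and do not come from pandas or NumPy.

```python
# Largest value of a signed 64-bit integer, the storage type behind
# timedelta64[ns].
INT64_MAX = 2**63 - 1

# Nanoseconds in one day.
NS_PER_DAY = 10**9 * 3600 * 24

# Whole days that fit, plus the nanoseconds left over.
max_days, leftover_ns = divmod(INT64_MAX, NS_PER_DAY)
print(max_days)

# 106751 days fits in the counter; one more day does not.
print(max_days * NS_PER_DAY <= INT64_MAX)
print((max_days + 1) * NS_PER_DAY > INT64_MAX)
```

Using integer division avoids the float round trip in the expression above, though both give the same number of whole days.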
