Pandas reindex and fill missing values: "Index must be monotonic"
Question
While answering this Stack Overflow question, I found some interesting behavior when using a fill method while reindexing a dataframe.
This old pandas bug report says that df.reindex(newIndex, method='ffill') should be equivalent to df.reindex(newIndex).ffill(), but that is NOT the behavior I'm seeing.
Here's a code snippet that illustrates the behavior:
import pandas as pd

# The original index is deliberately not sorted
df = pd.DataFrame({'values': 2}, index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']))
newIndex = pd.DatetimeIndex(['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05'])
print(df.reindex(newIndex).ffill())  # works
print(df.reindex(newIndex, method='ffill'))  # raises ValueError
The first print statement works as expected. The second raises a
ValueError: index must be monotonic increasing or decreasing
What is going on here?
Note that the sample df intentionally has a non-monotonic index. The question is about the order of operations in df.reindex(newIndex, method='ffill'). My expectation, as the bug report says, is that it should work: first reindex with the new index, then fill.
As you can see, newIndex.is_monotonic is True, and the fill works when called separately but fails when passed as a parameter to reindex.
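One way to probe the order of operations is sketched below. It rests on an assumption about pandas internals that the thread itself does not confirm: with method='ffill', reindex appears to look up fill positions in the original index, so it is df.index, not newIndex, that has to be monotonic. Under that assumption, sorting the original frame first should be enough. The names error, filled_sorted and filled_after are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'values': 2},
    index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']),
)
newIndex = pd.DatetimeIndex(
    ['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05']
)

# Attempt the fill on the unsorted frame and record the error, if any.
error = None
try:
    df.reindex(newIndex, method='ffill')
except ValueError as exc:
    error = str(exc)
print(error)

# Sort the ORIGINAL index first (assumption: this is the index that must
# be monotonic), then reindex with the fill method.
filled_sorted = df.sort_index().reindex(newIndex, method='ffill')

# The two-step form from the question, for comparison.
filled_after = df.reindex(newIndex).ffill()

print(filled_sorted)
print(filled_after)
```

The two forms are not interchangeable in general: the pandas documentation notes that the method argument of reindex does not fill NaN values already present in the original data, whereas a separate .ffill() call does. They only coincide here because the sample frame has no missing values to begin with.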
Answer
It seems that this needs to be done on the columns as well.
In [76]: frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])
In [77]: frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill', columns=states)
---> ValueError: index must be monotonic increasing or decreasing
In [78]: frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill', columns=states.sort())
Out[78]:
   Ohio  Texas  California
a     0      1           2
b     0      1           2
c     3      4           5
d     6      7           8
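The session above never defines states, so the sketch below assumes a hypothetical value, ['Texas', 'Utah', 'California'], purely for illustration. If states is a plain Python list, states.sort() sorts in place and returns None, which would mean In [78] passes columns=None and leaves the columns untouched; that matches the output shown. A possible alternative, offered as a sketch rather than the answerer's method, is to apply the fill only when reindexing the rows and then reindex the columns in a separate call:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(9).reshape((3, 3)),
    index=['a', 'c', 'd'],
    columns=['Ohio', 'Texas', 'California'],
)

# Hypothetical value: `states` is not defined in the original session.
states = ['Texas', 'Utah', 'California']

# Step 1: forward-fill along the rows only; the row index is sorted.
rows_filled = frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill')

# Step 2: reindex the columns without a fill method, so the unsorted
# column labels are never used for a fill lookup.
result = rows_filled.reindex(columns=states)
print(result)
```

With this split, a column that does not exist in the original frame (Utah in the assumed list) comes back as NaN instead of being filled from a neighbouring column.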