Bug in pandas.Series/DataFrame.fillna limit?

Question

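I've been trying to fill a DataFrame and a Series using fillna with the value and limit keywords. The limit is respected when value is not passed, but as soon as value is passed the limit is no longer respected. Here's an example using a DataFrame: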

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f', 'h'],
                  columns=['one', 'two', 'three'])
df2 = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k'])

In [7]: df2
Out[7]:
        one       two     three
a -0.942695  0.465658 -0.966754
b       NaN       NaN       NaN
c -1.208036  0.287274 -1.116466
d       NaN       NaN       NaN
e  0.041212  0.065966 -1.895570
f  0.869104 -3.481962 -0.280699
g       NaN       NaN       NaN
h -1.151732 -0.310296 -1.701202
i       NaN       NaN       NaN
j       NaN       NaN       NaN
k       NaN       NaN       NaN

In [8]: df2.fillna(method='pad', limit=1)
Out[8]:
        one       two     three
a -0.942695  0.465658 -0.966754
b -0.942695  0.465658 -0.966754
c -1.208036  0.287274 -1.116466
d -1.208036  0.287274 -1.116466
e  0.041212  0.065966 -1.895570
f  0.869104 -3.481962 -0.280699
g  0.869104 -3.481962 -0.280699
h -1.151732 -0.310296 -1.701202
i -1.151732 -0.310296 -1.701202
j       NaN       NaN       NaN
k       NaN       NaN       NaN

In [9]: df2.fillna(value=999,method='pad', limit=1)
Out[9]:
          one         two       three
a   -0.942695    0.465658   -0.966754
b  999.000000  999.000000  999.000000
c   -1.208036    0.287274   -1.116466
d  999.000000  999.000000  999.000000
e    0.041212    0.065966   -1.895570
f    0.869104   -3.481962   -0.280699
g  999.000000  999.000000  999.000000
h   -1.151732   -0.310296   -1.701202
i  999.000000  999.000000  999.000000
j  999.000000  999.000000  999.000000
k  999.000000  999.000000  999.000000

Am I missing something here, or is this a bug?

Cheers

Edit: using pandas 0.8.1 on Python 2.7 with NumPy 1.6.1

Answer

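This is actually by design. The limit keyword is meant to go with the method keyword, because a limit on consecutive fills only makes sense once you have specified an ordering (i.e., forward-fill or back-fill), and you don't have that with value. That is why the call with value=999 above fills every NaN: once value is given, method and limit have no effect.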
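
If the goal is "pad at most one step, then fill whatever is left with a constant", one way to get it is to split the work into two calls. The following is only a sketch: it assumes a reasonably recent pandas release in which DataFrame.ffill is available (it was not written against 0.8.1), and the fixed numbers replace the random data from the question so the result is reproducible.

```python
import pandas as pd

# Deterministic stand-in for the random frame used in the question.
df = pd.DataFrame(
    {"one": [1.0, 2.0, 3.0, 4.0, 5.0], "two": [10.0, 20.0, 30.0, 40.0, 50.0]},
    index=["a", "c", "e", "f", "h"],
)
df2 = df.reindex(list("abcdefghijk"))

# Step 1: forward-fill, but at most one consecutive missing row.
# This is where limit applies, because the fill has a direction.
padded = df2.ffill(limit=1)

# Step 2: replace whatever is still missing with a constant.
result = padded.fillna(999)

print(result)
```

Here rows b, d, g and i are padded from the row above them, while j and k, which lie beyond the one-step limit, receive 999.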
