Uncertainty about the Interpolate Function in Pandas


Question

I am working with the interpolate function in pandas. Here is a toy example to illustrate the question:

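import numpy as np
import pandas as pd

df = pd.DataFrame({'Data': np.random.normal(size=200), 'Data2': np.random.normal(size=200)})

df.iloc[1, 0] = np.nan  # blank out one cell in the 'Data' column

print(df)

print(df.interpolate(method='nearest'))  # 'nearest' requires SciPy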

My question: Does the interpolate function work across multiple columns? That is, does it use multivariate analysis to determine the value for a missing field, or does it only look at each column individually?
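The question can also be probed empirically. The following is a minimal sketch, not part of the original post: it builds two frames that share the same 'Data' column but have very different 'Data2' columns, blanks the same cell in each, and compares the filled values. The fixed seed, the frame names, and the use of method='linear' (chosen here because it does not need SciPy) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility only

data = rng.normal(size=20)
# With linear interpolation, a single gap should be filled with the mean of its neighbours.
neighbour_mean = (data[0] + data[2]) / 2

# Two frames with an identical 'Data' column and unrelated 'Data2' columns.
df_a = pd.DataFrame({"Data": data.copy(), "Data2": rng.normal(size=20)})
df_b = pd.DataFrame({"Data": data.copy(), "Data2": rng.normal(size=20) * 100.0})

# Blank out the same cell in both frames.
df_a.iloc[1, 0] = np.nan
df_b.iloc[1, 0] = np.nan

fill_a = df_a.interpolate(method="linear").loc[1, "Data"]
fill_b = df_b.interpolate(method="linear").loc[1, "Data"]

# If 'Data2' influenced the result, the two fills would differ.
same_fill = bool(np.isclose(fill_a, fill_b))
matches_neighbours = bool(np.isclose(fill_a, neighbour_mean))
print(same_fill, matches_neighbours)
```

If both checks print True, the filled value depends only on the neighbouring entries of the same column, which is what a column-by-column (univariate) interpolation would produce.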

Recommended answer

The docs list the available methods. Most rely only on the index, possibly via the univariate scipy.interpolate.interp1d or other univariate SciPy methods, so each column is interpolated on its own and the values of other columns do not enter the calculation:

method : {'linear', 'time', 'index', 'values', 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'barycentric', 'krogh', 'polynomial', 'spline', 'piecewise_polynomial', 'pchip'}

  • 'linear' (the default): ignore the index and treat the values as equally spaced. This is the only method supported on MultiIndexes.
  • 'time': interpolation works on daily and higher resolution data to interpolate a given length of interval.
  • 'index', 'values': use the actual numerical values of the index.
  • 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'barycentric', 'polynomial' are passed to scipy.interpolate.interp1d. Both 'polynomial' and 'spline' require that you also specify an order (int), e.g. df.interpolate(method='polynomial', order=4). These use the actual numerical values of the index.
  • 'krogh', 'piecewise_polynomial', 'spline', and 'pchip' are all wrappers around the SciPy interpolation methods of similar names. These use the actual numerical values of the index.

See the SciPy documentation for further details and charts illustrating the output.
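The difference between a method that ignores the index and one that uses its numerical values can be seen on a small series. This is a minimal sketch with made-up data (the series `s` and its unevenly spaced index are assumptions chosen for illustration); it uses only 'linear' and 'index', neither of which needs SciPy.

```python
import numpy as np
import pandas as pd

# Unevenly spaced numeric index: the gap 1 -> 10 is nine times the gap 0 -> 1.
s = pd.Series([0.0, np.nan, 10.0], index=[0, 1, 10])

by_position = s.interpolate(method="linear")  # treats the rows as equally spaced
by_index = s.interpolate(method="index")      # weights by the index values

print(by_position.loc[1])  # 5.0, the midpoint of the neighbouring values
print(by_index.loc[1])     # 1.0, one tenth of the way from 0.0 to 10.0
```

In both cases the only inputs are the series' own values and, for 'index', its index. No other column is consulted.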
