Uncertainty about the Interpolate Function in Pandas
Question
I am working with the interpolate function in pandas. Here is a toy example to illustrate the case:
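import numpy as np
import pandas as pd

df = pd.DataFrame({'Data': np.random.normal(size=200), 'Data2': np.random.normal(size=200)})
df.iloc[1, 0] = np.nan  # knock out one value in the 'Data' column
print(df)
print(df.interpolate(method='nearest'))  # 'nearest' requires SciPy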
My question: Does the interpolate function work over multiple columns? That is, does it use multivariate analysis to determine the value for a missing field? Or does it simply look at individual columns?
Recommended answer
The docs list the available methods. Most rely only on the index, possibly via the univariate scipy.interpolate.interp1d or other univariate scipy methods, so each column is filled independently of the others:
method : {‘linear’, ‘time’, ‘index’, ‘values’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘barycentric’, ‘krogh’, ‘polynomial’, ‘spline’, ‘piecewise_polynomial’, ‘pchip’}
- ‘linear’ (the default): ignore the index and treat the values as equally spaced. This is the only method supported on MultiIndexes.
- ‘time’: interpolation works on daily and higher resolution data to interpolate given length of interval.
- ‘index’, ‘values’: use the actual numerical values of the index.
- ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘barycentric’, ‘polynomial’ are passed to scipy.interpolate.interp1d. Both ‘polynomial’ and ‘spline’ require that you also specify an order (int), e.g. df.interpolate(method='polynomial', order=4). These use the actual numerical values of the index.
- ‘krogh’, ‘piecewise_polynomial’, ‘spline’, and ‘pchip’ are all wrappers around the scipy interpolation methods of similar names. These use the actual numerical values of the index.
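A quick way to check the column-by-column behaviour yourself is sketched below. It is only an illustration, not taken from the pandas docs: the frame, its column names "a" and "b", and the index values are made up for the example. Only the ‘linear’ and ‘index’ methods are used, so SciPy should not be needed.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: column "a" has one gap, column "b" is unrelated data.
# The index is deliberately unevenly spaced (0, 1, 4, 5).
df = pd.DataFrame(
    {"a": [0.0, np.nan, 10.0, 20.0], "b": [5.0, 100.0, -3.0, 7.0]},
    index=[0, 1, 4, 5],
)

# 1) Column independence: rescaling "b" leaves the value filled into "a" unchanged.
filled = df.interpolate(method="linear")
filled_other = df.assign(b=df["b"] * 1000).interpolate(method="linear")
same_fill = filled["a"].equals(filled_other["a"])

# Interpolating the column on its own gives the same result as using the frame.
matches_alone = filled["a"].equals(df["a"].interpolate(method="linear"))

# 2) 'linear' treats rows as equally spaced; 'index' uses the index values.
by_position = df["a"].interpolate(method="linear")  # gap filled with 5.0
by_index = df["a"].interpolate(method="index")      # gap filled with 2.5

print(same_fill, matches_alone)
print(by_position.iloc[1], by_index.iloc[1])
```

With ‘linear’ the gap sits halfway between its neighbours by position, so it becomes 5.0. With ‘index’ the gap is at index 1 on the way from index 0 (value 0.0) to index 4 (value 10.0), so it becomes 2.5. In both cases the values in "b" play no part.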