Pandas interpolate within a groupby


Question

I've got a dataframe with the following information:

    filename    val1    val2
t                   
1   file1.csv   5       10
2   file1.csv   NaN     NaN
3   file1.csv   15      20
6   file2.csv   NaN     NaN
7   file2.csv   10      20
8   file2.csv   12      15

I would like to interpolate the values in the dataframe based on the indices, but only within each file group.

To interpolate, I would usually do

df = df.interpolate(method="index")

To group, I would do

grouped = df.groupby("filename")

I would like the interpolated dataframe to look like this:

    filename    val1    val2
t                   
1   file1.csv   5       10
2   file1.csv   10      15
3   file1.csv   15      20
6   file2.csv   NaN     NaN
7   file2.csv   10      20
8   file2.csv   12      15

Where the NaN's are still present at t = 6 since they are the first items in the file2 group.

I suspect I need to use "apply", but haven't been able to figure out exactly how...

grouped.apply(interp1d)
...
TypeError: __init__() takes at least 3 arguments (2 given)

Any help would be appreciated.

Recommended answer

>>> df.groupby('filename').apply(lambda group: group.interpolate(method='index'))
    filename  val1  val2
t                       
1  file1.csv     5    10
2  file1.csv    10    15
3  file1.csv    15    20
6  file2.csv   NaN   NaN
7  file2.csv    10    20
8  file2.csv    12    15
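
For readers on a recent pandas release, the same idea can also be sketched with `transform` restricted to the numeric columns. This is a minimal sketch under stated assumptions, not the answer's own method: the frame is rebuilt from the table in the question, and the names `value_cols` and `result` are made up for illustration. Selecting only `val1` and `val2` keeps the string `filename` column out of `interpolate`, and `transform` returns values aligned to the original `t` index.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (index "t", NaN for missing values).
df = pd.DataFrame(
    {
        "filename": ["file1.csv"] * 3 + ["file2.csv"] * 3,
        "val1": [5, np.nan, 15, np.nan, 10, 12],
        "val2": [10, np.nan, 20, np.nan, 20, 15],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Hypothetical helper name: the numeric columns to interpolate.
value_cols = ["val1", "val2"]

# Interpolate against the index values, one filename group at a time.
# The default fill direction is forward, so a NaN at the start of a
# group has no earlier value to interpolate from and stays NaN.
result = df.copy()
result[value_cols] = df.groupby("filename")[value_cols].transform(
    lambda values: values.interpolate(method="index")
)

print(result)
```

Depending on the pandas version, the `apply` form above may add `filename` as an extra index level or complain about the non-numeric column; the `transform` form is one way to sidestep both, at the cost of listing the value columns explicitly.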
