Pandas interpolate within a groupby
Problem description
I've got a dataframe with the following information:
    filename   val1  val2
t
1   file1.csv     5    10
2   file1.csv   NaN   NaN
3   file1.csv    15    20
6   file2.csv   NaN   NaN
7   file2.csv    10    20
8   file2.csv    12    15
I would like to interpolate the values in the dataframe based on the indices, but only within each file group.
To interpolate, I would normally do
df = df.interpolate(method="index")
And to group, I do
grouped = df.groupby("filename")
I would like the interpolated dataframe to look like this:
    filename   val1  val2
t
1   file1.csv     5    10
2   file1.csv    10    15
3   file1.csv    15    20
6   file2.csv   NaN   NaN
7   file2.csv    10    20
8   file2.csv    12    15
The NaNs should remain at t = 6, since they are the first items in the file2 group and there is no earlier value to interpolate from.
I suspect I need to use "apply", but I haven't been able to figure out exactly how:
grouped.apply(interp1d)
...
TypeError: __init__() takes at least 3 arguments (2 given)
Any help would be appreciated.
Recommended answer
>>> df.groupby('filename').apply(lambda group: group.interpolate(method='index'))
    filename   val1  val2
t
1   file1.csv     5    10
2   file1.csv    10    15
3   file1.csv    15    20
6   file2.csv   NaN   NaN
7   file2.csv    10    20
8   file2.csv    12    15
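On more recent pandas releases, the one-liner above may add the group key as an extra index level or warn about operating on the grouping column. The following is a minimal, self-contained sketch that rebuilds the example frame from the question and interpolates only the numeric columns within each group. The names `value_cols` and `result` are illustrative and not part of the original answer, and the explicit `group_keys=False` is an assumption made here to keep the original `t` index.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question, indexed by t.
df = pd.DataFrame(
    {
        "filename": ["file1.csv"] * 3 + ["file2.csv"] * 3,
        "val1": [5, np.nan, 15, np.nan, 10, 12],
        "val2": [10, np.nan, 20, np.nan, 20, 15],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Illustrative name: the columns that should be interpolated.
value_cols = ["val1", "val2"]

# Interpolate each file's rows separately, using the index (t) as the x-axis.
# group_keys=False keeps the original index instead of prepending the filename.
interpolated = df.groupby("filename", group_keys=False)[value_cols].apply(
    lambda group: group.interpolate(method="index")
)

# Write the interpolated values back next to the untouched filename column;
# assignment aligns on the index t.
result = df.copy()
result[value_cols] = interpolated
print(result)
```

Because `interpolate` fills forward by default, the leading NaNs at t = 6 are left in place, which matches the output requested in the question.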