Using polyfit on a pandas DataFrame and adding the results as new columns

Problem description

I have a dataframe like this. For each Id, I have two x values (x1, x2) and two y values (y1, y2). I want to pass these to polyfit(), get the slope and the x-intercept, and add them as new columns.

    Id        x         y
    1     0.79978   0.018255
    1     1.19983   0.020963
    2     2.39998   0.029006
    2     2.79995   0.033004
    3     1.79965   0.021489
    3     2.19969   0.024194
    4     1.19981   0.019338
    4     1.59981   0.022200
    5     1.79971   0.025629
    5     2.19974   0.028681

I really need help grouping the correct rows and passing them to polyfit. I have been struggling with this, and any help would be most welcome.

Recommended answer

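You can group by Id and apply the fit within each group. With degree 1, np.polyfit returns the coefficients as [slope, intercept]. First, set Id as the index so that the per-group results align with the rows on assignment and you can avoid a merge later.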

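import pandas as pd
import numpy as np

# Index by Id so the per-group result aligns with the rows on assignment.
df = df.set_index('Id')
# Degree-1 fit per group; each result is an array [slope, intercept].
df['fit'] = df.groupby('Id').apply(lambda g: np.polyfit(g['x'], g['y'], 1))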

df is now:

          x         y                                           fit
Id                                                                 
1   0.79978  0.018255  [0.0067691538557680215, 0.01284116612923385]
1   1.19983  0.020963  [0.0067691538557680215, 0.01284116612923385]
2   2.39998  0.029006   [0.00999574968122608, 0.005016400680051043]
2   2.79995  0.033004   [0.00999574968122608, 0.005016400680051043]
3   1.79965  0.021489  [0.006761823817618233, 0.009320083766623343]
3   2.19969  0.024194  [0.006761823817618233, 0.009320083766623343]
...
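Because each Id has exactly two points, the degree-1 fit is simply the line through them, which gives an easy way to sanity-check the coefficients. The sketch below is not part of the original answer: it rebuilds the sample data from the question, loops over the groups instead of using apply, and compares np.polyfit against the two-point formula. The names check, direct_slope and direct_intercept are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "x": [0.79978, 1.19983, 2.39998, 2.79995, 1.79965,
          2.19969, 1.19981, 1.59981, 1.79971, 2.19974],
    "y": [0.018255, 0.020963, 0.029006, 0.033004, 0.021489,
          0.024194, 0.019338, 0.022200, 0.025629, 0.028681],
})

rows = []
for key, g in df.groupby("Id"):
    # Least-squares line through the group's points: [slope, intercept].
    slope, intercept = np.polyfit(g["x"], g["y"], 1)
    # Two-point formula; assumes exactly two rows per Id with distinct x.
    (x1, x2), (y1, y2) = g["x"].to_numpy(), g["y"].to_numpy()
    direct_slope = (y2 - y1) / (x2 - x1)
    direct_intercept = y1 - direct_slope * x1
    rows.append((key, slope, direct_slope, intercept, direct_intercept))

check = pd.DataFrame(
    rows,
    columns=["Id", "slope", "direct_slope", "intercept", "direct_intercept"],
).set_index("Id")
print(check)
```

If a group ever contains more than two rows, only the np.polyfit columns remain meaningful; the two-point columns assume the layout shown in the question.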

If you want a separate column for each coefficient, you can apply pd.Series:

# Expand each [slope, intercept] array into two columns.
df[['slope', 'intercept']] = df['fit'].apply(pd.Series)
df = df.drop(columns='fit').reset_index()

df is now:

   Id        x         y     slope  intercept
0   1  0.79978  0.018255  0.006769   0.012841
1   1  1.19983  0.020963  0.006769   0.012841
2   2  2.39998  0.029006  0.009996   0.005016
3   2  2.79995  0.033004  0.009996   0.005016
4   3  1.79965  0.021489  0.006762   0.009320
5   3  2.19969  0.024194  0.006762   0.009320
6   4  1.19981  0.019338  0.007155   0.010753
7   4  1.59981  0.022200  0.007155   0.010753
8   5  1.79971  0.025629  0.007629   0.011898
9   5  2.19974  0.028681  0.007629   0.011898
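The question asks for the x-intercept, whereas the constant term returned by np.polyfit is the fitted value at x = 0, that is, the y-intercept. For a line y = slope * x + intercept, the x-intercept is -intercept / slope. The following is a minimal sketch under that reading, not the original answer's code: it uses only the first two Ids from the question, and the helper fit_line and the column x_intercept are hypothetical names introduced here. It also returns a pd.Series from the group function, so the coefficients arrive as columns directly and can be joined back on Id.

```python
import numpy as np
import pandas as pd

# First two Ids from the question's sample data.
df = pd.DataFrame({
    "Id": [1, 1, 2, 2],
    "x": [0.79978, 1.19983, 2.39998, 2.79995],
    "y": [0.018255, 0.020963, 0.029006, 0.033004],
})

def fit_line(g):
    """Fit y = slope * x + intercept for one group (hypothetical helper)."""
    slope, intercept = np.polyfit(g["x"], g["y"], 1)
    return pd.Series({
        "slope": slope,
        "intercept": intercept,               # y-intercept (value at x = 0)
        "x_intercept": -intercept / slope,    # where the line crosses y = 0
    })

# One row of coefficients per Id, indexed by Id.
coeffs = df.groupby("Id")[["x", "y"]].apply(fit_line)

# Attach the coefficients to every row of the matching Id.
result = df.join(coeffs, on="Id")
print(result)
```

The x-intercept is undefined for a group whose fitted slope is zero, so real data may need a guard for that case.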
