从变量中的字符串公式计算数据框列? [英] Compute dataframe columns from a string formula in variables?

查看:54
本文介绍了从变量中的字符串公式计算数据框列?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用一个excel文件来确定传感器的名称,并使用一个公式来创建新的合成"商品.基于真实传感器的传感器.我想将公式写为字符串,例如"y1 + y2 + y3"而不是"df ['y1'] + df ['y2'] + df ['y3]".但我看不到要使用哪种方法?

I use an excel file in which I determine the names of sensor, and a formula allowing me to create a new "synthetic" sensor based on real sensors. I would like to write the formula as string like for example "y1 + y2 + y3" and not "df ['y1'] + df ['y2'] + df ['y3]" but I don't see which method to use?

Excel文件示例:

Excel file example:

因此,我的脚本必须为此excel文件的每一行创建一个新的传感器.然后,该新传感器将上传到我的数据库中.计算新值的传感器数量是可变的!

My script must therefore create a new sensor for each line of this excel file. This new sensor will then be uploaded to my database. The number of sensors to calculate the new value is variable!

这是我的代码示例:

# From excel file
sensor_cell = '008253_sercit_sercit_batg_b_flr0_g_tctl_z1_tr_prval|officielles_darksky_bierset_temp|officielles_darksky_uccle_temp|005317_esa_001_hur_piece_030110'
formula_cell = "df['y1'] + df['y2'] + df['y3'] + df['y4']"
# formula_cell = 'y1+y2+y3+y4' --> what I would like to be able to write in my  excel file cell

list = sensor_cell.split('|')

df = []
for sensor in list:
    position = list.index(sensor) + 1
    df_y = search_ES(sensor) # Function that return a df with timestamp and value from my database
    df_y = df_y.rename(columns={'value': "y"+ str(position)})
    df.append(df_y)

df = pd.concat(df, axis=1, sort=True)
# I would like to have :
# y1 = df['y1']
# y2 = df['y2']
# y3 = df['y3']
# y4 = df['y4']
df = df.dropna()
print(df)
df['value'] = eval(formula_cell) # Formula from excel file
print(df)

df,然后应用公式:

df before applying the formula :

                           y1    y2    y3  y4
2019-12-11 00:00:00  20.500000  5.62  6.03  29
2019-12-11 01:00:00  21.180000  5.54  6.15  30
2019-12-11 02:00:00  21.020000  5.28  6.29  30
2019-12-11 03:00:00  20.760000  4.99  6.36  29
2019-12-11 04:00:00  20.680000  4.80  6.26  30
2019-12-11 05:00:00  20.760000  4.63  6.07  30
2019-12-11 06:00:00  20.900000  4.49  5.91  30
2019-12-11 07:00:00  20.920000  4.20  6.05  30
2019-12-11 08:00:00  21.320000  4.15  5.95  30
2019-12-11 09:00:00  21.840000  4.42  5.81  30
2019-12-11 10:00:00  22.460000  4.24  5.81  30
2019-12-11 11:00:00  22.240000  4.11  5.89  31
2019-12-11 12:00:00  22.420000  4.43  6.15  32
2019-12-11 13:00:00  21.740000  4.37  6.14  32
2019-12-11 14:00:00  22.500000  4.48  6.24  31
2019-12-11 15:00:00  22.980000  4.87  6.46  32
2019-12-11 16:00:00  22.420000  4.56  6.21  32
2019-12-11 17:00:00  22.320000  4.40  5.92  32
2019-12-11 18:00:00  21.939999  4.52  6.19  32
2019-12-11 19:00:00  20.680000  4.30  5.35  32
2019-12-11 20:00:00  20.900000  4.28  4.94  32
2019-12-11 21:00:00  20.859999  4.55  5.21  32
2019-12-11 22:00:00  20.520000  4.28  4.73  32
2019-12-11 23:00:00  20.320000  4.24  4.90  32

df应用公式后:

                          y1    y2    y3  y4      value
2019-12-11 00:00:00  20.500000  5.62  6.03  29  61.150000
2019-12-11 01:00:00  21.180000  5.54  6.15  30  62.870000
2019-12-11 02:00:00  21.020000  5.28  6.29  30  62.590000
2019-12-11 03:00:00  20.760000  4.99  6.36  29  61.110000
2019-12-11 04:00:00  20.680000  4.80  6.26  30  61.740000
2019-12-11 05:00:00  20.760000  4.63  6.07  30  61.460000
2019-12-11 06:00:00  20.900000  4.49  5.91  30  61.300000
2019-12-11 07:00:00  20.920000  4.20  6.05  30  61.170000
2019-12-11 08:00:00  21.320000  4.15  5.95  30  61.420000
2019-12-11 09:00:00  21.840000  4.42  5.81  30  62.070000
2019-12-11 10:00:00  22.460000  4.24  5.81  30  62.510000
2019-12-11 11:00:00  22.240000  4.11  5.89  31  63.240000
2019-12-11 12:00:00  22.420000  4.43  6.15  32  65.000000
2019-12-11 13:00:00  21.740000  4.37  6.14  32  64.250000
2019-12-11 14:00:00  22.500000  4.48  6.24  31  64.220000
2019-12-11 15:00:00  22.980000  4.87  6.46  32  66.310000
2019-12-11 16:00:00  22.420000  4.56  6.21  32  65.190000
2019-12-11 17:00:00  22.320000  4.40  5.92  32  64.640000
2019-12-11 18:00:00  21.939999  4.52  6.19  32  64.649999
2019-12-11 19:00:00  20.680000  4.30  5.35  32  62.330000
2019-12-11 20:00:00  20.900000  4.28  4.94  32  62.120000
2019-12-11 21:00:00  20.859999  4.55  5.21  32  62.619999
2019-12-11 22:00:00  20.520000  4.28  4.73  32  61.530000
2019-12-11 23:00:00  20.320000  4.24  4.90  32  61.460000

编辑-解决方案:

使用 df.eval 解决了该问题:

formula_cell = 'fictive = y1+y2+y3+y4' 
df.eval(formula_cell, inplace=True)

推荐答案

我认为pandas.query是您想要的. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html

I think pandas.query is what you want. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html

示例:

formula_cell = "y1 + y2 + y3 + y4"
df['value'] = df.query(formula_cell)

这篇关于从变量中的字符串公式计算数据框列?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆