Excel formula evaluation in Pandas

Question

Say I have a dict like this, whose keys are Excel cell references and whose values are either formulas or numbers.

input_dict = {
    "B25": "=B2*B4",
    "C25": "=C2*C4",
    "D25": "=D2*D4",
    "E25": "=E2*E4",
    "F25": "=F2*F4",
    "G25": "=G2*G4",
    "B22": 0,
    "C22": "=SUM(B22:B28)",
    "D22": "=SUM(C22:C28)",
    "E22": "=SUM(D22:D28)",
    "F22": "=SUM(E22:E28)",
    "G22": "=SUM(F22:F28)",
    "B28": "=B19*B20*B22",
    "C28": "=C19*C20*C22",
    "D28": "=D19*D20*D22",
    "E28": "=E19*E20*E22",
    "F28": "=F19*F20*F22",
    "G28": "=G19*G20*G22",
    "B2": 1000,
    "C2": 900,
    "D2": 880,
    "E2": 860,
    "F2": 840,
    "G2": 800,
    "B4": 0.95,
    "C4": 0.90,
    "D4": 0.80,
    "E4": 0.80,
    "F4": 0.70,
    "G4": 0.60,
    "B19": 0.001,
    "C19": 0.001,
    "D19": 0.001,
    "E19": 0.001,
    "F19": 0.001,
    "G19": 0.002,
    "B20": 4,
    "C20": 3,
    "D20": 4,
    "E20": 4,
    "F20": 3,
    "G20": 4
  }

How do I perform the calculation on this type of data?

My approach was to convert the dict to a DataFrame and perform the calculations on it, but I am stuck.

import pandas as pd

df = pd.DataFrame(list(input_dict.items()))

The output of df is shown below.

    0   1
0   G22 =SUM(F22:F28)
1   G4  0.6
2   F2  840
3   D2  880
4   C20 3
5   C4  0.9
6   B28 =B19*B20*B22
7   F25 =F2*F4
8   B25 =B2*B4
9   G25 =G2*G4
10  C28 =C19*C20*C22
11  G28 =G19*G20*G22
12  F22 =SUM(E22:E28)
13  C25 =C2*C4
14  B19 0.001
15  E4  0.8
16  D22 =SUM(C22:C28)
17  D4  0.8
18  G2  800
19  E28 =E19*E20*E22
20  D20 4
21  G20 4
22  E25 =E2*E4
23  F20 3
24  G19 0.002
25  E22 =SUM(D22:D28)
26  C2  900
27  D25 =D2*D4
28  E2  860
29  D28 =D19*D20*D22
30  C19 0.001
31  F28 =F19*F20*F22
32  B20 4
33  B2  1000
34  F4  0.7
35  E19 0.001
36  D19 0.001
37  B4  0.95
38  B22 0
39  F19 0.001
40  C22 =SUM(B22:B28)
41  E20 4

How can I perform Excel-like calculations in Python?

The expected output is as follows:

{
    "B25": "950",
    "C25": "810",
    "D25": "704",
    "E25": "688",
    "F25": "588",
    "G25": "480",
    "B22": 0,
    "C22": 950,
    "D22": 1757.15,
    "E22": 2454.1214,
    "F22": 3710.908,
    "G22": 4161.220736,
    "B28": 0,
    "C28": -2.85,
    "D28": -7.0286,
    "E28": -9.8164856,
    "F28": -9.396914743,
    "G28": -29.687264,
    "B2": 1000,
    "C2": 900,
    "D2": 880,
    "E2": 860,
    "F2": 840,
    "G2": 800,
    "B4": 0.95,
    "C4": 0.90,
    "D4": 0.80,
    "E4": 0.80,
    "F4": 0.70,
    "G4": 0.60,
    "B19": 0.001,
    "C19": 0.001,
    "D19": 0.001,
    "E19": 0.001,
    "F19": 0.001,
    "G19": 0.002,
    "B20": 4,
    "C20": 3,
    "D20": 4,
    "E20": 4,
    "F20": 3,
    "G20": 4
  }

Recommended answer

You could use regex (regular expressions) and Python's eval function.

Suppose we have

d = {'A1': '=A2+A3',
     'A2': '=SUM(A3:A5)',
     'A3': 3,
     'A4': 6,
     'A5': -1,
     ...}

The whole function looks like this:

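import re

def g(s):
    """Excel-like evaluation with recursion (reads the global dict d)"""
    # Plain numbers need no evaluation
    if isinstance(s, (int, float)):
        return s
    # Drop the '=' that marks a formula
    s = re.sub(r'=', '', s)
    # SUM(A3:B5) -> sum of g(d[cell]) over every cell of the range
    # (as written, the patterns only match single-digit row numbers)
    s = re.sub(r'SUM\(([A-Z])([0-9]):([A-Z])([0-9])\)',
               r'sum([g(d[chr(i)+str(j)]) for j in range(\g<2>,\g<4>+1) for i in range(ord("\g<1>"), ord("\g<3>")+1)])', s)
    # Any remaining cell reference -> recursive dictionary look-up
    s = re.sub(r'([A-Z][0-9])', r'g(d["\1"])', s)
    return eval(s)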

For example:

>>> print(g(d['A1']))
11
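The cells in the question have two-digit row numbers (B22, B28) and ranges that cover cells missing from the dict (B23, B24, ...), which g() as written does not handle. Below is a minimal sketch of one way to extend the same regex-plus-eval idea. The names evaluate, cells and TOKEN are made up for this illustration, treating missing cells as 0 is an assumption about the desired behaviour, and SUM ranges are still limited to single-letter columns.

```python
import re

# One pattern for both token kinds: a SUM over a range with single-letter
# columns, or a plain cell reference such as B22.
TOKEN = re.compile(r"SUM\(([A-Z])([0-9]+):([A-Z])([0-9]+)\)|([A-Z]+[0-9]+)")


def evaluate(cells, ref, _active=None):
    """Evaluate cell `ref` of `cells`; references to missing cells count as 0."""
    _active = set() if _active is None else _active
    value = cells.get(ref, 0)
    if not (isinstance(value, str) and value.startswith("=")):
        return value                      # plain number (or empty cell)
    if ref in _active:
        raise ValueError("circular reference at " + ref)
    _active.add(ref)

    def to_python(match):
        if match.group(5):                # plain reference
            return 'cell("%s")' % match.group(5)
        c1, r1, c2, r2 = match.group(1, 2, 3, 4)
        names = [chr(c) + str(r)
                 for r in range(int(r1), int(r2) + 1)
                 for c in range(ord(c1), ord(c2) + 1)]
        return "sum([%s])" % ", ".join('cell("%s")' % n for n in names)

    # Single pass, so text produced by to_python is never rescanned.
    expression = TOKEN.sub(to_python, value[1:])
    namespace = {"cell": lambda name: evaluate(cells, name, _active)}
    result = eval(expression, namespace)
    _active.discard(ref)
    return result


# A subset of the question's input_dict
cells = {"B2": 1000, "B4": 0.95, "B19": 0.001, "B20": 4, "B22": 0,
         "B25": "=B2*B4", "B28": "=B19*B20*B22", "C22": "=SUM(B22:B28)"}
print(evaluate(cells, "C22"))  # 950.0
```

Passing the dict as an argument instead of reading a global d keeps the function reusable, and the _active set turns a circular reference into an error instead of unbounded recursion.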

Let's look at the individual steps:

  • First of all, we get rid of the =. Alternatively, one could test for a leading = and only evaluate strings that start with it; that is left to the reader.
  • With re.sub(r'([A-Z][0-9])', r'g(d["\1"])', any_string), every group of one capital letter followed by one digit (e.g. 'A3') is replaced by its dictionary look-up (e.g. 'g(d["A3"])').
  • We need to apply g() again because the looked-up cell value may itself be a formula (recursion).
  • Note: for entries like 'A$3' or '$AB$4', as they occur in Excel, one could use r'\$?([A-Z]+)\$?([0-9]+)' as the search pattern and r'g(d["\g<1>\g<2>"])' as the substitution. The $ has to be escaped, because an unescaped $ is an end-of-string anchor in a regex.
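The first two steps can be tried in isolation. This is a small sketch only: formula_body and to_lookup are hypothetical helper names, and the output is just the rewritten string, so no dictionary d or function g is needed to run it.

```python
import re

# Optional "$" markers are matched but kept out of the captured groups.
CELL_REF = re.compile(r"\$?([A-Z]+)\$?([0-9]+)")


def formula_body(value):
    """Return the text after a leading "=", or None when `value` is not a formula."""
    if isinstance(value, str) and value.startswith("="):
        return value[1:]
    return None


def to_lookup(formula):
    """Rewrite each cell reference in `formula` as a look-up g(d["..."])."""
    return CELL_REF.sub(r'g(d["\g<1>\g<2>"])', formula)


print(formula_body("=A2+A3"))      # A2+A3
print(to_lookup("A$3+$AB$4*2"))    # g(d["A3"])+g(d["AB4"])*2
```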

Then we can evaluate this string with eval(). So far, one can use every arithmetic operator that Python implements, such as +, -, *, /, // and %.
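One caveat is that a few operators are spelled differently in Excel and Python: Excel writes exponentiation as ^ (which is bitwise XOR in Python) and "not equal" as <>. A rough sketch of a translation step that could run before eval() follows; excel_ops_to_python is a hypothetical helper, it is not exhaustive, and it ignores differences in operator precedence.

```python
def excel_ops_to_python(expression):
    """Translate a few Excel operators whose Python spelling differs."""
    return expression.replace("^", "**").replace("<>", "!=")


print(eval(excel_ops_to_python("2^3+10/4")))  # 10.5
```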

All other functions need to be implemented manually by substituting them with Python expressions. Here is an example for SUM(A3:B10):

With r'SUM\(([A-Z])([0-9]):([A-Z])([0-9])\)' we search for the formula. As written, the pattern only matches single-digit row numbers; use [0-9]+ to match rows such as 10.

[chr(i)+str(j) for j in range(\g<2>,\g<4>+1) for i in range(ord("\g<1>"), ord("\g<3>")+1)] gives us all table indices that are in the sum. Then we apply g(d[...]) to each of them (recursion) and take the sum.
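To see which cells the comprehension produces, it can be pulled out into a small function. This is only an illustrative sketch: cells_in_range is a made-up name, and like the pattern above it assumes single-letter columns.

```python
def cells_in_range(first, last):
    """List the cell names in the rectangle from `first` to `last`, row by row."""
    c1, r1 = first[0], int(first[1:])
    c2, r2 = last[0], int(last[1:])
    return [chr(i) + str(j)
            for j in range(r1, r2 + 1)
            for i in range(ord(c1), ord(c2) + 1)]


print(cells_in_range("A3", "B5"))  # ['A3', 'B3', 'A4', 'B4', 'A5', 'B5']
```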

Of course, this approach can be extended to other Excel functions in the same way.
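As a sketch of what such an extension could look like, the function name can be captured by the regex and looked up in a table of Python callables. FUNCTIONS, RANGE_CALL and apply_range_function are hypothetical names, the set of supported functions is arbitrary, and for brevity the values here are plain numbers rather than nested formulas.

```python
import re
from statistics import mean

# Excel function name -> Python callable taking a list of numbers
FUNCTIONS = {"SUM": sum, "MIN": min, "MAX": max, "AVERAGE": mean}
RANGE_CALL = re.compile(r"(SUM|MIN|MAX|AVERAGE)\(([A-Z])([0-9]+):([A-Z])([0-9]+)\)")


def apply_range_function(call, values):
    """Evaluate a call such as "MAX(A3:A5)" against `values`, a dict of numbers."""
    name, c1, r1, c2, r2 = RANGE_CALL.fullmatch(call).groups()
    numbers = [values[chr(i) + str(j)]
               for j in range(int(r1), int(r2) + 1)
               for i in range(ord(c1), ord(c2) + 1)]
    return FUNCTIONS[name](numbers)


values = {"A3": 3, "A4": 6, "A5": -1}
print(apply_range_function("SUM(A3:A5)", values))  # 8
print(apply_range_function("MAX(A3:A5)", values))  # 6
```

Adding another range function then only needs one more entry in the table and in the regex alternation.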
