pandas 中的Excel公式评估 [英] Excel formula evaluation in Pandas
问题描述
说我有一个这样的字典,其键是excel中的单元格引用,值是公式或整数.
Say i have a dict like this whose key's are cell references in excel and value's are either formula or integer.
input_dict = {
"B25": "=B2*B4",
"C25": "=C2*C4",
"D25": "=D2*D4",
"E25": "=E2*E4",
"F25": "=F2*F4",
"G25": "=G2*G4",
"B22": 0,
"C22": "=SUM(B22:B28)",
"D22": "=SUM(C22:C28)",
"E22": "=SUM(D22:D28)",
"F22": "=SUM(E22:E28)",
"G22": "=SUM(F22:F28)",
"B28": "=B19*B20*B22",
"C28": "=C19*C20*C22",
"D28": "=D19*D20*D22",
"E28": "=E19*E20*E22",
"F28": "=F19*F20*F22",
"G28": "=G19*G20*G22",
"B2": 1000,
"C2": 900,
"D2": 880,
"E2": 860,
"F2": 840,
"G2": 800,
"B4": 0.95,
"C4": 0.90,
"D4": 0.80,
"E4": 0.80,
"F4": 0.70,
"G4": 0.60,
"B19": 0.001,
"C19": 0.001,
"D19": 0.001,
"E19": 0.001,
"F19": 0.001,
"G19": 0.002,
"B20": 4,
"C20": 3,
"D20": 4,
"E20": 4,
"F20": 3,
"G20": 4
}
如何对此类数据进行计算.
How to perform the calculation on this type of data.
我的方法是将dict转换为DataFrame并对其执行计算,但是我被卡住了.
My approach was to convert the dict to DataFrame and perform the calculations on it but i am stuck.
df = pd.DataFrame(list(input_dict.items()))
df的输出如下所示.
The output of df is shown below.
0 1
0 G22 =SUM(F22:F28)
1 G4 0.6
2 F2 840
3 D2 880
4 C20 3
5 C4 0.9
6 B28 =B19*B20*B22
7 F25 =F2*F4
8 B25 =B2*B4
9 G25 =G2*G4
10 C28 =C19*C20*C22
11 G28 =G19*G20*G22
12 F22 =SUM(E22:E28)
13 C25 =C2*C4
14 B19 0.001
15 E4 0.8
16 D22 =SUM(C22:C28)
17 D4 0.8
18 G2 800
19 E28 =E19*E20*E22
20 D20 4
21 G20 4
22 E25 =E2*E4
23 F20 3
24 G19 0.002
25 E22 =SUM(D22:D28)
26 C2 900
27 D25 =D2*D4
28 E2 860
29 D28 =D19*D20*D22
30 C19 0.001
31 F28 =F19*F20*F22
32 B20 4
33 B2 1000
34 F4 0.7
35 E19 0.001
36 D19 0.001
37 B4 0.95
38 B22 0
39 F19 0.001
40 C22 =SUM(B22:B28)
41 E20 4
如何在Python中执行类似Excel的计算?
How to perform excel like calculations in Python ?
预期输出如下
{
"B25": "950",
"C25": "810",
"D25": "704",
"E25": "688",
"F25": "588",
"G25": "480",
"B22": 0,
"C22": 950,
"D22": 1757.15,
"E22": 2454.1214,
"F22": 3710.908,
"G22": 4161.220736,
"B28": 0,
"C28": -2.85,
"D28": -7.0286,
"E28": -9.8164856,
"F28": -9.396914743,
"G28": -29.687264,
"B2": 1000,
"C2": 900,
"D2": 880,
"E2": 860,
"F2": 840,
"G2": 800,
"B4": 0.95,
"C4": 0.90,
"D4": 0.80,
"E4": 0.80,
"F4": 0.70,
"G4": 0.60,
"B19": 0.001,
"C19": 0.001,
"D19": 0.001,
"E19": 0.001,
"F19": 0.001,
"G19": 0.002,
"B20": 4,
"C20": 3,
"D20": 4,
"E20": 4,
"F20": 3,
"G20": 4
}
推荐答案
您可以使用regex(正则表达式)和Python的eval
函数.
You could use regex (regular expressions) and Python's eval
function.
假设我们有
d = {'A1': '=A2+A3',
'A2': '=SUM(A3:A5)',
'A3': 3,
'A4': 6,
'A5': -1,
...}
整个功能看起来像
import re
def g(s):
"""Excel-like evaluation with recurrence"""
if isinstance(s,(int, float)):
return s
s=re.sub(r'=', '', s)
s=re.sub(r'SUM\(([A-Z])([0-9]):([A-Z])([0-9])\)','sum([g(d[chr(i)+str(j)]) for j in range(\g<2>,\g<4>+1) for i in range(ord("\g<1>"), ord("\g<3>")+1)])',s)
s=re.sub(r'([A-Z][0-9])',r'g(d["\1"])',s)
return eval(s)
例如
>>> print(g(d['A1']))
11
让我们看一下单个步骤:
Let's have a look at the single steps:
- 首先,我们摆脱了
=
.可以选择编写一个测试,并且仅对以=
开头的公式求值,直到读者为止. -
与
re.sub(r'([A-Z][0-9])', r'g(d["\g<1>"])', any_string)
一起 用一个大写字母和一个数字替换一个组(例如'A3') 并对其进行字典查找(例如'g(d ["A3"])')
- First of all we get rid of the
=
. One could alternatively write a test and only evaluate the formula if it starts with a=
, up to the reader. With
re.sub(r'([A-Z][0-9])', r'g(d["\g<1>"])', any_string)
one replaces a group with one capital letter and a number (e.x. 'A3') with the dictionary look-up of it (e.x. 'g(d["A3"])')
- 如果新的单元格值仍然是公式(重复发生),我们需要再次应用
g()
- 注意:如果在Excel中出现诸如"A $ 3"或"$ AB $ 4"之类的条目,则可以使用
r'$?([A-Z]+)$?([0-9])'
作为搜索模式,并使用r'd["\g<1>\g<2>"]'
作为替换.
- we need to apply
g()
again if the new cell value is still a formula (recurrence) - Note: If there are entries like 'A$3' or '$AB$4' as they occur in Excel, one could use
r'$?([A-Z]+)$?([0-9])'
as search pattern andr'd["\g<1>\g<2>"]'
as substitution.
然后我们可以使用eval()
评估此字符串.到目前为止,您可以使用所有已实现的Python操作,例如+, -, *, /, //, %, etc.
Then we can evaluate this string with eval()
. Up to now one can use all implemented Python operations such as +, -, *, /, //, %, etc.
所有其他功能都需要通过用Python表达式替换来手动实现.她是SUM(A3:B10)
的示例:
All other functions need to be implemented manually by substituting them with Python expressions. Her is an example for SUM(A3:B10)
:
使用r'SUM\(([A-Z])([0-9]):([A-Z])([0-9])\)'
我们搜索公式.
[chr(i)+str(j) for j in range(\g<2>,\g<4>+1) for i in range(ord("\g<1>"), ord("\g<3>")+1)]]
给出总和中的所有表索引.
然后我们对它们每个应用g(d[...])
(重复)并求和.
[chr(i)+str(j) for j in range(\g<2>,\g<4>+1) for i in range(ord("\g<1>"), ord("\g<3>")+1)]]
gives us all table indices that are in the sum.
Then we apply g(d[...])
on each of them (recurrence) and take the sum.
这当然可以扩展到任何Excel公式.
This can be extended to any Excel formula of course.
这篇关于 pandas 中的Excel公式评估的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!