How can I get a Python-generated Excel document to correctly calculate array formulas?
Problem description
I am generating some Excel files with Python 3.6 and openpyxl.
At one point I have to calculate standard deviations of a subsection of the data. In Excel this is done with an array formula. Writing the correct formula is easy, but when I ask Excel to calculate, it shows "#NAME?" unless I manually enter each cell and press CTRL+SHIFT+ENTER.
Before I do that, however, the formula already shows the correct curly brackets, so I am not certain what I am actually changing.
Since we are talking about quite a few cells, manually entering each one of them is not feasible. Is there a way to make Excel or Python execute the formula correctly? I am open to using other libraries or writing a VB script if necessary.
Here is a minimal working example of the Python code:
from random import random, randrange

from openpyxl import Workbook
from openpyxl.utils.cell import get_column_letter

wb = Workbook()
ws = wb.create_sheet()
ws.title = 'Data'

# generate random data: values in column A, group labels 1..4 in column B
for i in range(100):
    ws.cell(column=1, row=1+i, value=random()*100)
    ws.cell(column=2, row=1+i, value=randrange(1, 5))

# calculate standard deviations for specific integer values
for i in range(4):
    ws.cell(column=4+i, row=1, value='STDEV {0}'.format(i+1))
    ws.cell(column=4+i, row=2, value='=STDEV.P(IF(B1:B100={0},A1:A100," "))'.format(i+1))
    # mark the cell as a single-cell array formula
    targetVal = '{0}2'.format(get_column_letter(4+i))
    ws.formula_attributes[targetVal] = {'t': 'array', 'ref': targetVal}

# cleaning up: drop the default sheet that Workbook() creates
wb.remove(wb["Sheet"])
wb.save("fTest.xlsx")
A quick explanation of the example: I generate 100 random numbers between 0 and 100 in column A and integers between 1 and 4 (inclusive) in column B.
Below the cell containing "STDEV 1", I calculate the standard deviation of all values in column A that have a 1 in column B, and likewise for the other STDEV columns.
If your Excel is set to calculate automatically, the generated document should show "#NAME?" in the STDEV.P cells. If you enter the formula cell and confirm with CTRL+SHIFT+ENTER, the formula executes correctly.
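To cross-check what the STDEV cells should show once the formulas evaluate, the same grouped population standard deviation can be computed in plain Python. This is only a sketch under the assumption that the two columns are available as lists; `group_pstdev` is a hypothetical helper name, not part of openpyxl.

```python
from collections import defaultdict
from statistics import pstdev


def group_pstdev(values, groups):
    """Return the population standard deviation of values per group label."""
    buckets = defaultdict(list)
    for value, group in zip(values, groups):
        buckets[group].append(value)
    # pstdev is the population form, matching Excel's STDEV.P
    return {group: pstdev(vals) for group, vals in buckets.items()}


# column A values and column B labels, shortened for illustration
print(group_pstdev([1, 3, 10, 10], [1, 1, 2, 2]))
```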
Recommended answer
openpyxl has a problem with the STDEV.P function: it does not recognise the name as written. Add the full, prefixed name like this:
ws.cell(column=4+i, row=2, value='=_xlfn.STDEV.P(IF(B1:B100={},A1:A100,0))'.format(i+1))
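With that change the formula works. Note that this line also replaces the " " placeholder with 0. STDEV.P ignores text inside an array but counts zeros, so keep " " if rows that do not match should be left out of the calculation.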
The openpyxl documentation points out that not all formulae are included:
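"If you're trying to use a formula that isn't known this could be because you're using a formula that was not included in the initial specification. Such formulae must be prefixed with _xlfn. to work."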
To check whether a formula is included, you can use:
from openpyxl.utils import FORMULAE
print("STDEV.P" in FORMULAE)
If this prints False, put _xlfn. in front of the function name for the formula to work.
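The membership check and the prefixing can be combined so that formulas do not have to be edited by hand. The following is a rough sketch, not an openpyxl feature: `add_xlfn_prefix` and the small `KNOWN` set are hypothetical stand-ins, and in real use you would presumably pass `openpyxl.utils.FORMULAE` as the set of known names. It assumes the formula contains no quoted text that looks like a function call.

```python
import re

# A function name directly followed by "(", e.g. "STDEV.P(" or "IF(".
# The lookbehind skips names that are already part of "_xlfn.NAME(".
_FUNC_CALL = re.compile(r"(?<![\w.])([A-Za-z][A-Za-z0-9.]*)\(")


def add_xlfn_prefix(formula, known_functions):
    """Prefix every function name not in known_functions with _xlfn."""
    def fix(match):
        name = match.group(1)
        if name.upper() in known_functions:
            return match.group(0)  # known name: leave unchanged
        return "_xlfn." + match.group(0)
    return _FUNC_CALL.sub(fix, formula)


# Stand-in for openpyxl.utils.FORMULAE (assumption, for illustration only)
KNOWN = frozenset({"IF", "SUM", "STDEV"})
print(add_xlfn_prefix('=STDEV.P(IF(B1:B100=1,A1:A100," "))', KNOWN))
# prints =_xlfn.STDEV.P(IF(B1:B100=1,A1:A100," "))
```

Passing an already prefixed formula through the helper leaves it unchanged, so it can be applied to every formula string before it is written to a cell.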