Keep the nature of array formulas when using openpyxl


Problem description


I am working on an Excel workbook with openpyxl and need to find the median of all values in one column for which another column holds a specific string. For this I combined MEDIAN and IF, which requires pressing Ctrl+Shift+Enter so that Excel treats the formula as an array formula.

However, openpyxl does not preserve the formula's array nature when it saves the file.

Example Data:

Values | IDS
3.5    |  1234    
2.5    |  1234    
6.5    |  5687    
7.5    |  1234    
9.5    |  1234    
1.0    |  7894

For the data above, I should get the median of only those values whose adjacent ID is 1234.

I have to use openpyxl, so I need a non-array formula that goes through the list of IDs and finds the median of the values for each matching ID.

Solution

openpyxl supports limited parsing of formulas embedded in cells. The openpyxl.formula package contains a Tokenizer class that breaks a formula into its constituent tokens. Each item it produces is a Token, which represents one token in an Excel formula.

Tokens have three attributes:

  • value: The string that was parsed to produce this token (a fragment of the formula exactly as you would write it in Excel, here a piece of our MEDIAN/IF formula)
  • type: A string identifying the type of token
  • subtype: A string identifying subtype of the token (optional, and defaults to "")

Your MEDIAN/IF formula may differ, so the code below is for reference only. It suggests how to parse (or, say, validate) an array formula using openpyxl. Usage is as follows:

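>>> from openpyxl.formula import Tokenizer
>>> # Pass the formula without the {} braces: the tokenizer expects a leading "=".
>>> tok = Tokenizer("""=MEDIAN(IF($B$1:$B$6="1234",$A$1:$A$6,""))""")
>>> # Recent openpyxl releases parse in the constructor; older ones needed tok.parse().
>>> for t in tok.items:
...     print(t.value, t.type, t.subtype)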
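
Tokenizing only inspects the formula text; it does not make openpyxl save the cell as an array formula. As a hedged sketch, assuming openpyxl 3.1 or later is installed (where openpyxl.worksheet.formula.ArrayFormula is available), the formula could be stored with its array flag as shown below. The helper names median_if_formula and write_array_formula are hypothetical, introduced only for illustration, and openpyxl is imported lazily so the string-building part runs without it.

```python
def median_if_formula(values_ref: str, ids_ref: str, wanted_id: str) -> str:
    """Build the MEDIAN(IF(...)) text without the {} braces Excel displays."""
    return f'=MEDIAN(IF({ids_ref}="{wanted_id}",{values_ref},""))'


def write_array_formula(ws, cell: str, formula: str) -> None:
    """Store `formula` in `cell`, flagged as an array formula.

    Assumption: openpyxl >= 3.1, where ArrayFormula(ref, text) exists.
    """
    from openpyxl.worksheet.formula import ArrayFormula  # lazy, optional import

    # The first argument is the range the array formula applies to;
    # for a single-cell result that is just the cell itself.
    ws[cell] = ArrayFormula(cell, formula)


formula = median_if_formula("$A$1:$A$6", "$B$1:$B$6", "1234")
print(formula)
```

With openpyxl installed, write_array_formula(ws, "D1", formula) on a worksheet ws, followed by saving the workbook, should leave D1 as an array formula when the file is opened in Excel. The target cell D1 is only an example.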

Note that openpyxl can read either a cell's formula or its last computed value, which you select with the data_only flag when opening a workbook. However, openpyxl does not and will not calculate the result of a formula. For greater control over Excel from Python, use one of the following:

  1. pycel
  2. xlwings
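
Because openpyxl never evaluates formulas, another option is to skip the formula and compute the conditional median in Python, then write the plain number into the cell. The following is a minimal sketch using only the standard library and the example data from the question. The function name median_for_id and the choice to hold IDs as strings are assumptions made for illustration.

```python
from statistics import median

# (value, id) pairs copied from the example data in the question.
rows = [
    (3.5, "1234"),
    (2.5, "1234"),
    (6.5, "5687"),
    (7.5, "1234"),
    (9.5, "1234"),
    (1.0, "7894"),
]


def median_for_id(rows, wanted_id):
    """Return the median of the values whose ID equals wanted_id, or None."""
    matching = [value for value, row_id in rows if row_id == wanted_id]
    return median(matching) if matching else None


result = median_for_id(rows, "1234")
print(result)  # 5.5
```

In a real workbook the pairs could be read with ws.iter_rows(min_row=2, values_only=True) and the result assigned to a cell before saving. The trade-off is that the cell then holds a static value rather than a live formula.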

Attribution: openpyxl documentation (Tokenizer)
