Openpyxl max_row and max_column wrongly report a larger figure

Problem description

My question concerns a function in a parsing script I'm developing. I am trying to write a Python function that finds the column number of a matching value in an Excel sheet. The workbook is created on the fly with openpyxl, and its first row holds headers, starting at the third column, each of which spans four columns merged into one. A later function parses content to be added to the columns under the matching header. (Additional info: the content I'm parsing is BLAST+ output. I'm trying to create a summary spreadsheet with one hit name per header and subcolumns for hits, gaps, span and identity. The first two columns hold the query contigs and their lengths.)

I had initially written a similar function for xlrd, and it worked. But when I rewrote it for openpyxl, I found that the max_row and max_column properties report more rows and columns than are actually present. For instance, this pilot input has 20 rows, but max_row reports 82. Note that I manually selected the empty rows and columns in Excel, right-clicked and deleted them, as advised elsewhere on this forum. This did not change the result.

def find_column_number(x):
    # hrsh is the openpyxl worksheet created earlier in the script
    col = 0
    print("maxrow =", hrsh.max_row)
    print("maxcol =", hrsh.max_column)
    # openpyxl rows and columns are 1-based, so iterate from 1 to max inclusive
    for rowz in range(1, hrsh.max_row + 1):
        print("now the row is", rowz)
        for colz in range(1, hrsh.max_column + 1):
            print("now the column is", colz)
            name = hrsh.cell(row=rowz, column=colz).value
            if name == x:
                col = colz
    return col

The issue with max_row and max_column has been discussed here: https://bitbucket.org/openpyxl/openpyxl/issues/514/cell-max_row-reports-higher-than-actual. I applied the suggestion given there, but max_row is still wrong.

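# hrsh.rows may be a generator, so build a list before reversing it
for row in reversed(list(hrsh.rows)):
    values = [cell.value for cell in row]
    if any(values):
        print("last row with data is {0}".format(row[0].row))
        maxrow = row[0].row
        break  # stop at the first non-empty row found from the bottom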

I then tried the suggestion at https://www.reddit.com/r/learnpython/comments/3prmun/openpyxl_loop_through_and_find_value_of_the/ to get the column values. Once again, the script takes the empty columns into account and reports more columns than are actually present.

for currentRow in hrsh.rows:
    for currentCell in currentRow:
        print(currentCell.value)

Can you please help me resolve this error, or suggest another method to achieve my aim?

Recommended answer

As noted in the bug report you linked to, a sheet's reported dimensions can differ from what you expect because they may include empty rows or columns. If max_row and max_column are not reporting what you want, you will need to write your own code to find the first completely empty row. The most efficient way, of course, would be to start from max_row and work backwards, but the following is probably sufficient:

for max_row, row in enumerate(ws, 1):
    if all(c.value is None for c in row):
        break
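
The backwards scan mentioned above could look roughly like the following. This is only a sketch, not part of openpyxl: the helper names `last_data_row`, `last_data_column` and `find_header_column` are made up for illustration, and the code assumes that a cell counts as empty exactly when its value is `None`. It relies only on the worksheet's `max_row`, `max_column` and `cell()`.

```python
def last_data_row(ws):
    """Return the 1-based index of the last row holding any value, or 0 if none."""
    # Start at the reported maximum and walk upwards until a value is found.
    for row_idx in range(ws.max_row, 0, -1):
        for col_idx in range(1, ws.max_column + 1):
            if ws.cell(row=row_idx, column=col_idx).value is not None:
                return row_idx
    return 0


def last_data_column(ws):
    """Return the 1-based index of the last column holding any value, or 0 if none."""
    # Same idea as above, but walking leftwards from the reported maximum.
    for col_idx in range(ws.max_column, 0, -1):
        for row_idx in range(1, ws.max_row + 1):
            if ws.cell(row=row_idx, column=col_idx).value is not None:
                return col_idx
    return 0


def find_header_column(ws, header, header_row=1):
    """Return the 1-based column whose header cell equals `header`, else None.

    For a merged header the value is stored in the top-left cell of the
    merged range, so this returns the first column of the merged span.
    """
    for col_idx in range(1, ws.max_column + 1):
        if ws.cell(row=header_row, column=col_idx).value == header:
            return col_idx
    return None
```

For the sheet in the question, `last_data_row(hrsh)` should then give 20 rather than 82, provided the extra rows really hold no values. Cells that contain only whitespace or an empty string would still count as data under the assumption above.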
