Copying an entire column using OpenPyXL in Python 3


Problem description

I'm trying to copy an entire column using OpenPyXL. Google turns up plenty of examples that copy fixed ranges, but none that copy a whole column.

I have a workbook with a single worksheet that holds a long list of dates in column A and in column JX (A contains monthly dates, JX contains quarterly dates). I want the monthly dates column (A:A) copied to every worksheet in my target workbook whose name ends in 'M', and the quarterly dates column (JX:JX) copied to every worksheet whose name ends in 'Q'.

However, for some reason the last nested for loop, for src, dst in zip(ws_base[monthRange], ws_target['A:A']): copies only the first cell and nothing else. My monthRange and quarterRange strings appear to identify the correct columns, but Python isn't looping through the whole column, even though both ranges are defined.

Does anyone have any ideas?

import openpyxl

# Load the target workbook
targetwb = openpyxl.load_workbook('pythonOutput.xlsx')


# Load the source workbook
wb_base = openpyxl.load_workbook('Baseline_IFRS9_' + reportingMonth+'.xlsx')

# Go to row 9 and find "Geography:" to identify the relevant 
# month and quarter date columns

sentinel = u"Geography:"
ws_base = wb_base.active

found = 0
dateColumns = []

for column in ws_base:
    for cell in column:
        if cell.value == sentinel:
            dateColumns.append(cell.column)
            found += 1  # was "found + 1", which discarded the result

            if found == 2:
                break


ColumnM = dateColumns[0]
ColumnQ = dateColumns[1]

print('Monthly col is ' + ColumnM)
print('Quarterly col is ' + ColumnQ)

IndexM = int(openpyxl.utils.column_index_from_string(str(ColumnM)))
IndexQ = int(openpyxl.utils.column_index_from_string(str(ColumnQ)))

print('Monthly col index is ' + str(IndexM))
print('Quarterly col index is ' + str(IndexQ))

print('Proceeding to paste into our new workbook...')

sheetLoop = targetwb.sheetnames  # get_sheet_names() is deprecated


for sheets in sheetLoop:
    if sheets.endswith('Q'):
        ws_target = targetwb[sheets]
        quarterRange = ColumnQ + ':' + ColumnQ

        print('Copying and pasting quarterly dates into: ' + sheets)
        for src, dst in zip(ws_base[quarterRange], ws_target['A:A']):
            dst.value = src.value

    elif sheets.endswith('M'):
        ws_target = targetwb[sheets]
        monthRange = ColumnM + ':' + ColumnM

        print('Copying and pasting monthly dates into: ' + sheets)
        for src, dst in zip(ws_base[monthRange], ws_target['A:A']):
            dst.value = src.value

targetwb.save('pythonOutput.xlsx')

Here's a simpler form of my problem.

import openpyxl

wb1 = openpyxl.load_workbook('pythonInput.xlsx')
ws1 = wb1.active

wb2 = openpyxl.load_workbook('pythonOutput.xlsx')
ws2 = wb2.active

for src, dst in zip(ws1['A:A'], ws2['B:B']):
    print( 'Printing from ' + str(src.column) + str(src.row) + ' to ' + str(dst.column) + str(dst.row))
    dst.value = src.value

wb2.save('test.xlsx') 

So the problem here is that the for loop only prints A1 to B1. Shouldn't it be looping down through the rows?

Recommended answer

When you open a new XLSX file in a spreadsheet editor, you see a grid full of empty cells. Those empty cells are actually omitted from the file; a cell is written only once it has a non-empty value. You can see this for yourself: an XLSX file is essentially a ZIP archive of XML documents, which any archive manager can open.
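To check the claim about the file format without leaving Python, here is a minimal sketch that saves a workbook to an in-memory buffer and inspects it with the standard zipfile module. It assumes OpenPyXL is installed and that the first sheet is stored under xl/worksheets/sheet1.xml, which is the usual layout but is an assumption here; the cell value is made up for the illustration.

```python
import io
import zipfile

from openpyxl import Workbook

# Build a workbook with a single non-empty cell (illustrative value).
wb = Workbook()
ws = wb.active
ws["A1"] = "only value"

# Save to memory instead of disk so the sketch needs no files.
buffer = io.BytesIO()
wb.save(buffer)
buffer.seek(0)

# An XLSX file is a ZIP archive; list its members and read the sheet XML.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    # Assumed path of the first worksheet inside the archive.
    sheet_xml = archive.read("xl/worksheets/sheet1.xml").decode("utf-8")

# Only the cell that holds a value is present in the XML.
has_a1 = 'r="A1"' in sheet_xml
has_b1 = 'r="B1"' in sheet_xml

print(names)
print("A1 stored:", has_a1, "| B1 stored:", has_b1)
```

The member list shows the XML parts that make up the workbook, and the sheet XML carries an entry for A1 but none for the empty neighbours.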

OpenPyXL behaves in a similar fashion: new cells are created only when you access them. On an empty target sheet, the ws2['B:B'] range contains just one cell, B1, and zip stops as soon as the shortest iterable is exhausted, so only one pair of cells is ever processed.
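The truncation can be reproduced with two in-memory workbooks, so no input files are needed. This is a small sketch assuming a recent OpenPyXL release; the three month names are placeholder data, and the variable names are chosen only for this illustration.

```python
from openpyxl import Workbook

# Source sheet: three values in column A (placeholder data).
src_wb = Workbook()
src_ws = src_wb.active
for row, value in enumerate(["Jan", "Feb", "Mar"], start=1):
    src_ws.cell(row=row, column=1, value=value)

# Target sheet: brand new, nothing has been written or accessed yet.
dst_wb = Workbook()
dst_ws = dst_wb.active

src_cells = src_ws["A:A"]  # one cell per row that exists in the source
dst_cells = dst_ws["B:B"]  # an empty sheet reports a single row
pairs = list(zip(src_cells, dst_cells))

# zip() yields only as many pairs as the shorter input provides.
print(len(src_cells), len(dst_cells), len(pairs))
```

The source column has as many cells as it has rows, the untouched target column has one, and zip pairs up only that one.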

With this in mind, you can iterate over the source range alone and use explicit coordinates to write each value into the correct target cell:

import openpyxl

wb1 = openpyxl.load_workbook('pythonInput.xlsx')
ws1 = wb1.active

wb2 = openpyxl.load_workbook('pythonOutput.xlsx')
ws2 = wb2.active

for cell in ws1['A:A']:
    print('Printing from ' + str(cell.column) + str(cell.row))
    ws2.cell(row=cell.row, column=2, value=cell.value)

wb2.save('test.xlsx') 
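The same idea can be applied to the original scenario of copying A:A into sheets ending in 'M' and JX:JX into sheets ending in 'Q'. The following is only a sketch under stated assumptions: copy_column is a hypothetical helper, not part of OpenPyXL, and the sheet names and date strings are invented so that the example runs in memory without the real workbooks.

```python
from openpyxl import Workbook
from openpyxl.utils import column_index_from_string


def copy_column(src_ws, src_col, dst_ws, dst_col="A"):
    """Hypothetical helper: copy one column by explicit coordinates.

    Empty source cells are skipped, because ws.cell() only assigns
    a value when the value is not None.
    """
    src_idx = column_index_from_string(src_col)
    dst_idx = column_index_from_string(dst_col)
    for row in range(1, src_ws.max_row + 1):
        value = src_ws.cell(row=row, column=src_idx).value
        dst_ws.cell(row=row, column=dst_idx, value=value)


# Stand-in for the baseline workbook: monthly dates in A, quarterly in JX.
source = Workbook()
base = source.active
sample = [("2017-01", "2017-Q1"), ("2017-02", "2017-Q2"), ("2017-03", "2017-Q3")]
for row, (monthly, quarterly) in enumerate(sample, start=1):
    base.cell(row=row, column=1, value=monthly)
    base.cell(row=row, column=column_index_from_string("JX"), value=quarterly)

# Stand-in for the target workbook (sheet names are made up).
target = Workbook()
target.active.title = "RatesM"
target.create_sheet("RatesQ")
target.create_sheet("Notes")

# Pick the source column from the suffix of each sheet name.
for name in target.sheetnames:
    if name.endswith("M"):
        copy_column(base, "A", target[name])
    elif name.endswith("Q"):
        copy_column(base, "JX", target[name])

print([cell.value for cell in target["RatesM"]["A"]])
print([cell.value for cell in target["RatesQ"]["A"]])
```

Because the loop is driven by the source sheet's row count rather than by the target range, it no longer matters how many cells already exist in the target sheet. With real files, the two Workbook() stand-ins would be replaced by openpyxl.load_workbook calls and a final save on the target workbook.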
