Openpyxl 在保存时损坏 xlsx.即使没有进行任何更改 [英] Openpyxl corrupts xlsx on save. Even when no changes were made

查看:154
本文介绍了Openpyxl 在保存时损坏 xlsx.即使没有进行任何更改的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

TL;DR;

  • 使用 Openpyxl 将更改保存到大型 excel 文件会导致 xlsx 文件损坏
  • Excel 文件由多个带有图表、公式、图像和表格的选项卡组成.
  • Powershell 脚本可以毫无问题地将编辑内容保存到 xlsx 文件中.
  • 我可以使用 Openpyxl 从 excel 文件中读取单元格值,我还可以编辑 &手动保存 xlsx 文件.
  • Excel 文件未受保护.
  • 下面提供了所有错误和代码片段.

我无法将数据添加到我们的一个团队正在使用的 Excel 文件中.excel 文件相当大(+3MB),有几张纸,包含公式和图表,还有图像.

I'm unable to add data to an excel file that one of our teams is using. The excel file is fairly big(at +3MB), has several sheets, contains formulas and graphs and also has images.

谢天谢地,我需要输入数据的工作表没有这些,但是,我发现当我尝试保存工作簿时,我最终遇到了这些错误:

Thankfully the sheet I need to enter data to has none of that, however, I found that when I try to save the workbook, I end up with these errors:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    wb.save("new.xlsx")
  File "C:\Python3\lib\site-packages\openpyxl\workbook\workbook.py", line 392, in save
    save_workbook(self, filename)
  File "C:\Python3\lib\site-packages\openpyxl\writer\excel.py", line 293, in save_workbook
    writer.save()
  File "C:\Python3\lib\site-packages\openpyxl\writer\excel.py", line 275, in save
    self.write_data()
  File "C:\Python3\lib\site-packages\openpyxl\writer\excel.py", line 78, in write_data
    self._write_charts()
  File "C:\Python3\lib\site-packages\openpyxl\writer\excel.py", line 124, in _write_charts
    self._archive.writestr(chart.path[1:], tostring(chart._write()))
  File "C:\Python3\lib\site-packages\openpyxl\chart\_chart.py", line 134, in _write
    return cs.to_tree()
  File "C:\Python3\lib\site-packages\openpyxl\chart\chartspace.py", line 193, in to_tree
    tree = super(ChartSpace, self).to_tree()
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 154, in to_tree
    node = obj.to_tree(child_tag)
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 154, in to_tree
    node = obj.to_tree(child_tag)
  File "C:\Python3\lib\site-packages\openpyxl\chart\plotarea.py", line 135, in to_tree
    return super(PlotArea, self).to_tree(tagname)
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 146, in to_tree
    for node in nodes:
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\sequence.py", line 105, in to_tree
    el = v.to_tree(namespace=namespace)
  File "C:\Python3\lib\site-packages\openpyxl\chart\_chart.py", line 107, in to_tree
    return super(ChartBase, self).to_tree(tagname, idx)
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 146, in to_tree
    for node in nodes:
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\sequence.py", line 39, in to_tree
    el = v.to_tree(tagname, idx)
  File "C:\Python3\lib\site-packages\openpyxl\chart\series.py", line 170, in to_tree
    return super(Series, self).to_tree(tagname)
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 154, in to_tree
    node = obj.to_tree(child_tag)
  File "C:\Python3\lib\site-packages\openpyxl\descriptors\serialisable.py", line 154, in to_tree
    node = obj.to_tree(child_tag)
AttributeError: 'str' object has no attribute 'to_tree'

这是我用来执行另存为"的代码过程,所以不要在意添加数据等等,保存操作会损坏文件:

This is the code I used to perform the "save as" procedure, so never mind about adding data and whatnot, the save action corrupts the file:

from openpyxl import load_workbook

wb=load_workbook("Production Monitoring Script.xlsx")
ws=wb['Prod Perf Script Data']
wb.save("new.xlsx")

我尝试了一个使用 Powershell 的替代解决方案,它奏效了.

I tried an alternative solution with Powershell and it worked.

$xl=New-Object -ComObject Excel.Application
$wb=$xl.WorkBooks.Open('<path here>\Production Monitoring Script.xlsx')
$ws=$wb.WorkSheets.item(1)
$xl.Visible=$true

$ws.Cells.Item(7, 618)=50

$wb.SaveAs('<path here>\New.xlsx')
$xl.Quit()

它能够保存值50"在那个单元格中.

It was able to save the value "50" in that cell.

推荐答案

正如在原始帖子的评论中所讨论的:openpyxl 不支持某些图形和其他项目,即使它们位于未由您的代码修改的工作表中.这不是一个完整的解决方法,但仅当不受支持的对象位于其他工作表中时才有效.

As discussed in comments on original post: some graphics and other items are not supported by openpyxl, even if they are in worksheets not modified by your code. This is not a full workaround, but works when the unsupported objects are in other worksheets only.

我制作了一个示例 .xlsx 工作簿,其中包含两个工作表,TWC"和UV240 结果".此代码假定标题以结果"结尾的任何工作表都包含不受支持的图像,并创建两个临时文件 - imageoutput 包含不受支持的图像,而 outputtemp 包含可以被 openpyxl 修改而不会损坏的工作表.然后它们最后缝合在一起.

I made an example .xlsx workbook with two worksheets, 'TWC' and 'UV240 Results'. This code assumes that any worksheet whose title ends in 'Results' contains the unsupported images, and creates two temporary files - imageoutput contains the unsupported images, and outputtemp contains the worksheets that may be modified without corruption by openpyxl. Then they're stitched together at the end.

它可能在某些方面效率低下;请编辑或评论改进!

It may be a inefficient in parts; please edit or comment with improvements!

import os
import shutil
import win32com.client

from openpyxl import load_workbook

name = 'spreadsheet.xlsx'
outputfile = 'output.xlsx'
outputtemp = 'outputtemp.xlsx'

shutil.copyfile(name, 'output.xlsx')
wb = load_workbook('output.xlsx')
ws = wb['TWC']

# TWC doesn't have images. Anything ending with 'Results' has unsupported images etc

# Create new file with only openpyxl-unsupported worksheets
imageworksheets = [ws if ws.title.endswith('Results') else '' for ws in wb.worksheets]
if [ws for ws in wb if ws.title != 'TWC']:
    imageoutput = 'output2.xlsx'
    imagefilewritten = False
    while not imagefilewritten:
        try:
            shutil.copy(name, imageoutput)
        except PermissionError as error:
            # Catch an exception here - I usually have a GUI function
            pass
        else:
            imagefilewritten = True

    excel = win32com.client.Dispatch('Excel.Application')
    excel.Visible = False
    imagewb = excel.Workbooks.Open(os.path.join(os.getcwd(), imageoutput))
    excel.DisplayAlerts = False

    for i, ws in enumerate(imageworksheets[::-1]): # Go backwards to avoid reindexing
        if not ws:
            wsindex = len(imageworksheets) - i
            imagewb.Worksheets(wsindex).Delete()

    imagefileupdated = False
    while not imagefileupdated:
        try:
            imagewb.Save()
            imagewb.Close(SaveChanges = True)
            print('Temp image workbook saved.')
        except PermissionError as error:
            # Catch exception
            pass
        else:
            imagefileupdated = True

# Remove the unsupported worksheets in openpyxl
for ws in wb.worksheets:
    if ws in imageworksheets:
        wb.remove(ws)
wb.save(outputtemp)
print('Temp output workbook saved.')

''' Do your desired openpyxl manipulations on the remaining supported worksheet '''

# Merge the outputtemp and imageoutput into outputfile
wb1 = excel.Workbooks.Open(os.path.join(os.getcwd(), outputtemp))
wb2 = excel.Workbooks.Open(os.path.join(os.getcwd(), imageoutput))

for ws in wb1.Sheets:
    ws.Copy(wb2.Sheets(1))

wb2.SaveAs(os.path.join(os.getcwd(), outputfile))
wb1.Close(SaveChanges = True)
wb2.Close(SaveChanges = True)
print(f'Output workbook saved as {outputfile}.')

excel.Visible = True
excel.DisplayAlerts = True

这篇关于Openpyxl 在保存时损坏 xlsx.即使没有进行任何更改的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆