Python3 - Cannot read docx, odt file - UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 10: invalid continuation byte

Views: 326 Published: 2020/7/13 5:29:30 Tags: python file encoding utf-8 decode

Problem description

I am trying to split a large docx file into smaller files. To do that, I read the file in Python 3.6 with the following code:

with open('h.docx', 'r') as f:
    a = f.read()

It raises this error:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 10: invalid continuation byte

h.docx was created using LibreOffice Calc and contains only 'hello world'. I can read it successfully in Python 2.7 without any errors.

I tried:

with open('h.docx', 'r', encoding='latin-1') as f:
    a = f.read()

In this case I can read the file without any errors, but when I write the data to another file, the original contents are lost.
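A minimal sketch of why a latin-1 round trip can change the file, assuming the output was written in text mode with UTF-8 encoding (the question does not show the write step, so this is a guess at the cause). The byte string is made up for illustration; it only mimics the start of a ZIP container, with 0xea at position 10.

```python
# Made-up bytes resembling the start of a ZIP container; 0xea sits at index 10.
raw = b"PK\x03\x04\x14\x00\x00\x08\x00\x00\xea\x8b"

# latin-1 maps every byte to exactly one code point, so decoding never fails.
text = raw.decode("latin-1")

# Encoding that text as UTF-8 turns each byte above 0x7f into two bytes.
rewritten = text.encode("utf-8")

print(rewritten == raw)               # the UTF-8 output differs from the input
print(text.encode("latin-1") == raw)  # only latin-1 on both sides round-trips
```

If the write side used a different encoding from the read side, the bytes on disk would no longer form the original archive, which would match the symptom described.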

I also tried errors='surrogateescape', but when the data is written to another file, the original contents are again lost.
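For reference, .docx and .odt files are ZIP containers rather than text, so one hedged way to avoid the decoding step altogether is to open them in binary mode. The sketch below is not taken from the original post: the helper name `copy_bytes`, the temporary paths, and the stand-in archive content are all hypothetical, and a small ZIP file is generated in place of the real h.docx so the example runs on its own.

```python
import os
import tempfile
import zipfile


def copy_bytes(src, dst):
    """Copy a file byte-for-byte in binary mode, with no text decoding."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read())


# Build a small ZIP file as a stand-in for h.docx (hypothetical content).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "h.docx")
dst = os.path.join(workdir, "copy.docx")
with zipfile.ZipFile(src, "w") as zf:
    zf.writestr("word/document.xml", "<w:t>hello world</w:t>")

copy_bytes(src, dst)

# Compare the two files as raw bytes.
with open(src, "rb") as f1, open(dst, "rb") as f2:
    identical = f1.read() == f2.read()

# Inspect the container with zipfile instead of decoding the whole file.
with zipfile.ZipFile(dst) as zf:
    names = zf.namelist()
    xml = zf.read("word/document.xml").decode("utf-8")

print(identical, names)
```

Binary mode only preserves the bytes. Splitting the document into smaller valid documents would additionally require working with the XML parts inside the archive, which this sketch does not attempt.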
