UTF-16 file seeking in Python. How?

Views: 41. Published: 2021/9/15 19:38:24. Tags: python, utf-16

Question

For some reason I cannot seek in my UTF-16 file. It produces 'UnicodeException: UTF-16 stream does not start with BOM'. My code:

import codecs

f = codecs.open(ai_file, 'r', 'utf-16')
seek = self.ai_map[self._cbClass.Text]  # byte offset into the file (a valid int)
f.seek(seek)  # reads after this seek raise the BOM error
while True:
    ln = f.readline().strip()

I tried random things, such as first reading something from the stream, but that didn't help. I checked the offset being seeked to with a hex editor: the string starts at a character, not a null byte (I guess that's a good sign, right?). So how do I seek in a UTF-16 file in Python?

Recommended answer

Well, the error message is telling you why: the decoder is not reading a byte order mark (BOM). The BOM is at the beginning of the file, and without having read it, the UTF-16 decoder can't know what order the bytes are in. Apparently it either reads the BOM lazily, the first time you read rather than when you open the file, or else it assumes that the seek() is starting a new UTF-16 stream.
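To make the role of the BOM concrete, here is a small sketch (the sample text "hi" and the variable names are illustrative assumptions, not taken from the question). It shows that the same two bytes mean different characters depending on byte order, which is why the generic utf-16 decoder wants to see a BOM first:

```python
import codecs

# The two possible UTF-16 byte order marks defined by the codecs module.
bom_le = codecs.BOM_UTF16_LE  # b'\xff\xfe'
bom_be = codecs.BOM_UTF16_BE  # b'\xfe\xff'

text = u"hi"
le_bytes = text.encode("utf-16-le")  # explicit byte order: no BOM is written
be_bytes = text.encode("utf-16-be")  # same text, bytes swapped within each unit

# With a BOM in front, the generic "utf-16" codec can work out the byte order
# and strips the BOM from the result.
decoded_with_bom = (bom_le + le_bytes).decode("utf-16")

# Decoding little-endian data as big-endian does not fail; it silently yields
# different characters, so the decoder cannot safely guess.
misread = le_bytes.decode("utf-16-be")

print(repr(le_bytes), repr(be_bytes))
print(repr(decoded_with_bom), repr(misread))
```

After a seek into the middle of the file, the bytes at the new position are ordinary text rather than a BOM, so a decoder that expects a fresh stream has nothing to tell it the byte order.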

If your file doesn't have a BOM, that's definitely the problem, and you should specify the byte order when opening the file (see #2 below). Otherwise, I see two potential solutions:

1. Read the first two bytes of the file to get the BOM before you seek. You seem to say this didn't work, which suggests the decoder expects a fresh UTF-16 stream after the seek, so:

2. Specify the byte order explicitly by using utf-16-le or utf-16-be as the encoding when you open the file.
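The two options can be combined in a rough sketch like the following. The helper names pick_utf16_encoding and read_line_at, the little-endian default, and the temporary demo file are all assumptions made for illustration; they are not part of the question's code. The idea is to read the BOM once in binary mode, then open the file with an explicit byte order so that seeking no longer depends on a BOM:

```python
import codecs
import os
import tempfile


def pick_utf16_encoding(path, default="utf-16-le"):
    """Return an explicit UTF-16 codec name based on the file's first two bytes.

    The default used when no BOM is found is an assumption of this sketch.
    """
    with open(path, "rb") as raw:
        bom = raw.read(2)
    if bom == codecs.BOM_UTF16_LE:
        return "utf-16-le"
    if bom == codecs.BOM_UTF16_BE:
        return "utf-16-be"
    return default


def read_line_at(path, byte_offset):
    """Seek to a byte offset and decode one line using a fixed byte order."""
    encoding = pick_utf16_encoding(path)
    f = codecs.open(path, "r", encoding)
    try:
        f.seek(byte_offset)  # the offset is in bytes, not characters
        return f.readline().strip()
    finally:
        f.close()


# Build a small little-endian demo file: BOM, then two lines.
first = u"first line\n".encode("utf-16-le")
second = u"second line\n".encode("utf-16-le")
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(codecs.BOM_UTF16_LE + first + second)

# Byte offset of the second line: skip the 2-byte BOM and the first line.
offset = len(codecs.BOM_UTF16_LE) + len(first)
encoding = pick_utf16_encoding(demo_path)
line = read_line_at(demo_path, offset)
os.remove(demo_path)
print(encoding, line)
```

Two things to keep in mind with this approach: the offset has to fall on a 2-byte code unit boundary, and the explicit utf-16-le and utf-16-be codecs do not strip the BOM, so reading from offset 0 returns it as a leading U+FEFF character.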
