How to decode a Unicode string that is read from a file in Python?


Question

I have a file containing UTF-16 strings. When I read it, each string comes back wrapped in double quotes and looks like "b'\\xff\\xfeA\\x00'". Calling .decode on it raises AttributeError: 'str' object has no attribute 'decode'. I tried a few options, but none of them worked.

This is what the file I am reading looks like

Recommended answer

It looks like the file has been created by writing bytes literals to it, something like this:

some_bytes = b'Hello world'
with open('myfile.txt', 'w') as f:
    f.write(str(some_bytes))

This gets around the fact that attempting to write bytes to a file opened in text mode raises an error (a TypeError), but at the cost that the file now contains "b'Hello world'" (note the 'b' inside the quotes).
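If the file has already been written this way, one possible way to recover the text is to parse the literal back into bytes and then decode it. The following is a minimal sketch, not part of the original answer: `recover_text` is a hypothetical helper name, the sample bytes are the ones quoted in the question, and UTF-16 is assumed because the question says the data is UTF-16.

```python
import ast


def recover_text(literal: str, encoding: str = "utf-16") -> str:
    """Convert the text of a bytes literal back into a decoded str."""
    # literal_eval only evaluates Python literals, so it is safer than eval()
    raw = ast.literal_eval(literal.strip())
    if not isinstance(raw, bytes):
        raise ValueError("expected the text of a bytes literal")
    return raw.decode(encoding)


# str() on a bytes object yields its repr, including the b prefix and quotes
file_content = str(b'\xff\xfeA\x00')
print(file_content)

# Parsing the literal restores the bytes, which can then be decoded
recovered = recover_text(file_content)
print(recovered)
```

This only repairs existing data; fixing the code that writes the file remains the better solution.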

The solution is to decode the bytes to str before writing:

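some_bytes = 'Hello world'.encode('utf-16')  # sample UTF-16 data; b'Hello world' itself is not valid UTF-16
my_str = some_bytes.decode('utf-16')  # or whatever the encoding of the bytes might be
with open('myfile.txt', 'w', encoding='utf-16') as f:  # write the text using an explicit encoding
    f.write(my_str)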

Or open the file in binary mode and write the bytes directly:

some_bytes = b'Hello world'
with open('myfile.txt', 'wb') as f:
    f.write(some_bytes)

Note that you will need to provide the correct encoding when opening the file in text mode:

with open('myfile.txt', encoding='utf-16') as f:  # Be sure to use the correct encoding
    text = f.read()  # text is a str decoded from UTF-16
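The two approaches can be checked with a small round trip. This is a sketch under stated assumptions: the temporary directory, the `path` variable, and the sample text are illustrative only, and UTF-16 stands in for whatever encoding the data really uses.

```python
import os
import tempfile

text = "Hello world"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myfile.txt")  # hypothetical temporary file

    # Binary mode: encode explicitly when writing, decode explicitly when reading
    with open(path, "wb") as f:
        f.write(text.encode("utf-16"))
    with open(path, "rb") as f:
        from_binary = f.read().decode("utf-16")

    # Text mode: pass the same encoding and let open() do the conversion
    with open(path, "w", encoding="utf-16") as f:
        f.write(text)
    with open(path, encoding="utf-16") as f:
        from_text = f.read()

print(from_binary == text, from_text == text)
```

Either way, the encoding used for reading has to match the one used for writing.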

Consider running Python with the -b or -bb flag, which issues a warning or raises an exception, respectively, whenever bytes are passed to str(). This helps detect attempts to stringify bytes.
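The effect of the flag can be observed by running the same one-line snippet in a child interpreter with and without -bb. This is an illustrative sketch that assumes a CPython 3 interpreter where the -b option is available; the `snippet`, `plain`, and `strict` names are made up for the example.

```python
import subprocess
import sys

snippet = "str(b'Hello world')"

# Without the flag, str() on bytes succeeds silently
plain = subprocess.run(
    [sys.executable, "-c", snippet], capture_output=True, text=True
)

# With -bb, the same call raises BytesWarning and the interpreter exits with an error
strict = subprocess.run(
    [sys.executable, "-bb", "-c", snippet], capture_output=True, text=True
)

print("plain exit code:", plain.returncode)
print("strict exit code:", strict.returncode)
print("BytesWarning reported:", "BytesWarning" in strict.stderr)
```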
