Reading a binary file as plain text using Python

Question

A friend of mine wrote some simple poetry to a file using C's fprintf function. The file was opened in 'wb' mode, so the generated file is binary. I'd like to use Python to display the poetry as plain text.

What I currently get is lots of strings like this: ��������

The code I'm using:

with open("read-me-if-you-can.bin", "rb") as f:
    print f.read()

f.close()

Recommended answer

When dealing with text stored in a file, you have to know (or correctly guess) the character encoding that was used to write it. If the program reading the file assumes the wrong encoding, you will end up with strange characters in the text if you're lucky, and with utter garbage if you're unlucky.
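To make the "strange characters" versus "utter garbage" distinction concrete, here is a minimal Python 3 sketch. The sample string and the choice of Latin-1 as the wrong codec are assumptions made for illustration; they are not taken from the poetry file in the question.

```python
# Hypothetical sample text; the real file's content and encoding are unknown.
text = "naïve café"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# Right codec: the original text comes back unchanged.
right = utf8_bytes.decode("utf-8")

# Wrong but lenient codec: every byte maps to some character,
# so accented letters turn into two-character mojibake.
mojibake = utf8_bytes.decode("latin-1")

# Wrong and strict codec: invalid byte sequences are replaced
# with U+FFFD, the replacement character.
garbage = utf16_bytes.decode("utf-8", errors="replace")

print(right)
print(mojibake)
print(repr(garbage))
```

The bytes on disk are the same in every case; only the codec used to interpret them differs.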

Don't guess; find out. Ask your friend which character encoding he or she used when writing the poetry to the file, then open the file in Python with that encoding specified. Say the answer is "UTF-16-LE" (for the sake of example); you would then write:

with open("poetry.bin", encoding="utf-16-le") as f:
    print(f.read())

It seems you're still on Python 2, though, so there you would write:

import io
with io.open("poetry.bin", encoding="utf-16-le") as f:
    print f.read()

You could start by trying UTF-8, though, since it is a commonly used encoding.
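If asking the author is not possible, one way to try UTF-8 first and then fall back to other encodings might look like the Python 3 sketch below. The helper name `decode_with_candidates` and the candidate list are hypothetical, chosen for illustration only.

```python
def decode_with_candidates(data, encodings=("utf-8", "utf-16-le", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes strictly.

    The function name and the default candidate list are illustrative
    assumptions, not something the original answer prescribes.
    """
    for enc in encodings:
        try:
            # bytes.decode uses strict error handling by default, so a
            # wrong codec often (but not always) raises UnicodeDecodeError.
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit the data")


# In practice the bytes would come from the file, for example:
#     with open("poetry.bin", "rb") as f:
#         data = f.read()
# Here an in-memory sample stands in for the file.
data = "naïve café".encode("utf-8")
text, used = decode_with_candidates(data)
print(used, text)
```

A successful decode does not prove the encoding is correct: UTF-16-LE, for instance, accepts most even-length byte strings and may simply return unrelated characters. Treat the result as a guess to be checked by eye, which is why asking the author remains the better option.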
