General question about binary files

Question

I am a beginner and I am having trouble grasping binary files. When I write to a file in binary mode (in Python), I just write normal text. There is nothing binary about it. I know every file on my computer is a binary file, but I am having trouble distinguishing between files that I write in binary mode and files such as audio or video files, which show up as gibberish if I open them in a text editor.

How are files that show up as gibberish created? Can you please give an example of a small file that is created like this, preferably in Python?

I have a feeling I am asking a really stupid question but I just had to ask it. Googling around didn't help me.

Recommended answer

Here's a literal answer to your question:

import struct
with open('gibberish.bin', 'wb') as f:
    f.write(struct.pack('<4d', 3.14159, 42.0, 123.456, 987.654))

That packs those 4 floating-point numbers into a binary format (little-endian IEEE 754 64-bit floating point).
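To see why such a file looks like gibberish in a text editor, it may help to read the bytes back. The following is a minimal sketch, assuming Python 3; the temporary directory and the variable names (`values`, `raw`, `decoded`, `text_bytes`) are illustrative choices, not part of the original answer.

```python
import os
import struct
import tempfile

values = (3.14159, 42.0, 123.456, 987.654)

# Write to a temporary directory so the sketch does not clutter the working directory.
path = os.path.join(tempfile.mkdtemp(), 'gibberish.bin')

with open(path, 'wb') as f:
    f.write(struct.pack('<4d', *values))

with open(path, 'rb') as f:
    raw = f.read()

# Each double occupies 8 bytes, so 4 doubles give 32 bytes.
print(len(raw))

# The raw bytes are mostly not printable characters; this is what an editor shows as gibberish.
print(raw)

# Unpacking with the same format string recovers the original numbers.
decoded = struct.unpack('<4d', raw)
print(decoded)

# By contrast, writing text in binary mode just stores the character codes,
# which is why such a file still looks like normal text in an editor.
text_bytes = 'hello'.encode('ascii')
print(text_bytes)
```

The difference is in what the bytes represent: the encoded text is a sequence of printable character codes, whereas the packed doubles are raw IEEE 754 bit patterns that only occasionally coincide with printable characters.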

Here's (some of) what you need to know:

Reading or writing a file in binary mode applies no transformation to the data. In text mode, in addition to any decoding/encoding from/to Unicode, the data that you read or write is transformed according to the platform's conventions for "text files":

Unix/Linux/Mac OS X: no change

Older Mac (classic Mac OS): line separator is \r, changed to/from the Python standard \n

Windows: line separator is \r\n, changed to/from \n. Also (little known fact), Ctrl-Z aka \x1a is interpreted as end-of-file, a convention inherited from CP/M which recorded file sizes as the number of 128-byte sectors used.
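The line-separator translation described above can be observed on any platform. This is a rough sketch, assuming Python 3, where the `newline` argument of `open()` is used here to force the Windows-style convention instead of relying on the platform default; the file name and variable names are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lines.txt')

# Text mode with newline='\r\n': every '\n' written is translated to '\r\n' on disk,
# which mimics what text mode does by default on Windows.
with open(path, 'w', newline='\r\n') as f:
    f.write('one\ntwo\n')

# Binary mode shows the bytes exactly as they are stored, with no translation.
with open(path, 'rb') as f:
    raw = f.read()
print(raw)

# Text mode with newline=None (the default) translates '\r\n' back to '\n' when reading.
with open(path, 'r', newline=None) as f:
    text = f.read()
print(repr(text))
```

Reading the same file in both modes makes the point concrete: the program sees `\n` in text mode, while the bytes on disk contain `\r\n`.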
