Python Encoding: Open/Read Image File, Decode Image, Re-Encode Image


Problem description

Note: I don't know much about encoding/decoding, and after running into this problem those words have become complete jargon to me.

Question: I'm a little confused here. I was playing around with encoding/decoding images in order to store an image as a TextField in a Django model. Looking around Stack Overflow, I found I could decode an image from ascii (I think, or binary? Whatever open('file', 'wb') uses as its encoding; I'm assuming ascii by default) to latin1 and store it in a database with no problems.

The problem comes when creating the image from the latin1-decoded data. When attempting to write to a file handle, I get a UnicodeEncodeError saying that ascii encoding failed.

I think the problem is that when a file is opened as binary data (rb), its contents are not in a proper ascii encoding, because they are binary data. I then decode the binary data to latin1, but when it is converted back to ascii (which happens automatically when trying to write to the file), it fails for some unknown reason.

My guess is one of two things. Either decoding to latin1 converts the raw binary data into something else, so that when encoding back to ascii Python can't identify what was once raw binary data (although the original and decoded data have the same length). Or the problem lies not with the decoding to latin1 but with my attempt to ascii-encode binary data. In that case, how would I encode the latin1 data back to an image?

I know this is very confusing, but I'm confused by all of it, so I can't explain it well. If anyone can answer this question, they're probably a riddle master.

Some code to visualize the problem:

>>> image_handle = open('test_image.jpg', 'rb')
>>> 
>>> raw_image_data = image_handle.read()
>>> latin_image_data = raw_image_data.decode('latin1')
>>> 
>>> 
>>> # The raw data can't be processed by django 
... # but in `latin1` it works
>>> 
>>> # Analysis of the data
>>> 
>>> type(raw_image_data), len(raw_image_data)
(<type 'str'>, 2383864)
>>> 
>>> type(latin_image_data), len(latin_image_data)
(<type 'unicode'>, 2383864)
>>> 
>>> len(raw_image_data) == len(latin_image_data)
True
>>> 
>>> 
>>> # How to write back to as a file?
>>> 
>>> copy_image_handle = open('new_test_image.jpg', 'wb')
>>> 
>>> copy_image_handle.write(raw_image_data)
>>> copy_image_handle.close()
>>> 
>>> 
>>> copy_image_handle = open('new_test_image.jpg', 'wb')
>>> 
>>> copy_image_handle.write(latin_image_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
>>> 
>>> 
>>> latin_image_data.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
>>> 
>>> 
>>> latin_image_data.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)


Recommended answer

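The UnicodeEncodeError is popping up because a JPEG is a binary file, while ASCII encoding is meant for plain text in plain-text files. ASCII only covers the values 0 to 127, and a JPEG contains bytes above that range, so the characters produced by the latin1 decode cannot be encoded as ASCII when Python 2 implicitly tries to do so on write.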
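To answer the "how do I get the image back" part, here is a minimal sketch, assuming the text stored in the database came from a latin1 decode as in the question. It is written in Python 3 syntax (the session in the question is Python 2), and the 256-byte sample and the temporary file path are stand-ins for illustration, not the asker's real image. The idea is to encode with the same codec that was used for decoding, then write the resulting bytes to a file opened in binary mode.

```python
import os
import tempfile

# Stand-in for the bytes read from an image file: every possible byte value.
raw_image_data = bytes(range(256))

# latin1 maps each byte 0-255 to the code point with the same number,
# so the decoded text has exactly as many characters as the data has bytes.
latin_image_data = raw_image_data.decode('latin1')

# Encoding with the SAME codec reverses the decode step exactly.
restored = latin_image_data.encode('latin1')

# Write the restored bytes to a file opened in binary mode.
# The path is a hypothetical temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'new_test_image.bin')
with open(path, 'wb') as handle:
    handle.write(restored)

# Read the file back to confirm nothing changed on the way.
with open(path, 'rb') as handle:
    written = handle.read()
```

In the question's session, the equivalent step would presumably be `copy_image_handle.write(latin_image_data.encode('latin1'))` instead of writing the unicode object directly.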

Plain-text files can be created with generic text editors such as Notepad on Windows or nano on Linux. Most will use either ASCII or a Unicode encoding. When a text editor reads an ASCII file, it grabs a byte, say 01100001 (97 in decimal), and finds the corresponding glyph, 'a'.
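The byte-to-glyph lookup described above can be tried directly. This is a small sketch in Python 3 syntax; the variable names are made up for illustration. It decodes the byte 01100001 as ASCII, then shows that a byte above 127 has no ASCII meaning at all.

```python
# The byte from the paragraph above: 01100001 in binary, 97 in decimal.
byte = bytes([0b01100001])
glyph = byte.decode('ascii')  # the ASCII table maps 97 to 'a'

# A byte outside the 0-127 range is not defined by ASCII,
# so decoding it as ASCII raises an error instead of giving a glyph.
high = bytes([0xFF])
try:
    high.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```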

So when a text editor tries to read a JPEG, it will grab the same byte 01100001 and get 'a', but since the file holds information for displaying a photo, the text will just be gibberish. Try opening the JPEG in Notepad or nano.
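As a rough illustration of that gibberish, the sketch below treats the first four bytes of a JPEG as text. The byte values are an assumption: they are the usual opening of a JFIF-style JPEG, not bytes taken from the asker's file. All four are above 127, which would be consistent with the traceback in the question complaining about positions 0-3. Python 3 syntax is used.

```python
# Typical first four bytes of a JFIF-style JPEG (assumed for illustration).
jpeg_start = b'\xff\xd8\xff\xe0'

# Read as latin1 text, these bytes become accented letters, not image data.
as_text = jpeg_start.decode('latin1')

# ASCII has no characters for these code points, so encoding fails
# right at the start of the string.
try:
    as_text.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Encoding with latin1 instead gives the original bytes back.
round_trip = as_text.encode('latin1')
```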

As for encoding, here is an explanation: What is the difference between encode/decode?
