Unicode characters not rendering with PIL ImageFont

Question

I'm trying to write tiff images using box drawing characters, but all of the characters in question show up as:

The box draw characters (e.g. "┌─┐│└┘╞═╡╤╧╘╛") were pasted directly into the source code, and they show up correctly when saved to a text file, but I don't understand why they're not showing up on the image.

Here is an example of the code I'm using to draw the image:

# coding=utf-8
text = "┌─┐│└┘╞═╡╤╧╘╛"
from PIL import Image, ImageDraw, ImageFont, TiffImagePlugin
img = Image.new("1",(1200,1600),1)
font = ImageFont.truetype("cour.ttf",14,encoding="unic")
draw = ImageDraw.Draw(img)
draw.text((40,0), text, font=font, fill=0)
img.save("imagefile.tif","TIFF")

I'm using Python version 2.7.2 on Windows 7.

Recommended answer

I'm not sure which of these is your problem, because there are multiple ways you can get this, so I'll go over all of the possibilities:

First, make sure the file is actually saved as UTF-8. By default, Notepad, and many other editors, will save files in your system encoding, which is probably something like cp1252. Testing that "it looks right" and "when the script writes those characters to a file and I open that file in Notepad, it looks right" doesn't tell you anything; obviously if you save a cp1252 file and open it as cp1252, it looks right.

Just adding "coding=utf-8" to the top doesn't magically change how the file is saved (except with a few smart editors, like emacs). It just tells Python that "this source file is UTF-8", even if it's really something else. So, Python ends up interpreting your cp1252 as UTF-8 and getting mojibake, like an a-with-circumflex in place of a line-drawing character.
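
One way to test the first point without trusting how an editor displays the file is to read the raw bytes and try to decode them as UTF-8. This is only a minimal sketch; the helper name looks_like_utf8 is made up for illustration and is not part of Python or PIL.

```python
import io


def looks_like_utf8(path):
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    # Read bytes, not text, so no codec is applied behind our back.
    with io.open(path, "rb") as f:
        raw = f.read()
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A pure-ASCII file also passes this check, since ASCII is a subset of UTF-8. A True result therefore only says the bytes are consistent with UTF-8, not that the editor deliberately saved them that way.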

You're usually better off using explicit backslash escapes, like \u250c instead of ┌, especially if you don't even know how to tell if the file is UTF-8, much less how to fix it.

Second, you almost never want to put non-ASCII characters into a str literal; use a unicode literal unless you have a good reason to do otherwise.

On top of that, if you pass draw.text a str, PIL will decode it with your default charset—which again is probably not UTF-8. So, even if everything else so far were correct, your code would be handing over some UTF-8 to be parsed as cp1252, so mojibake again. Using a unicode literal would avoid this problem entirely; otherwise, you need to pass text.decode('utf-8').
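
To see what that mismatch does to a single character, here is a small sketch that encodes one box-drawing character as UTF-8 and then decodes the bytes both ways. It assumes cp1252 is the default charset, as guessed above; on a machine with a different default the garbage will look different.

```python
# -*- coding: utf-8 -*-
corner = u"\u250c"                      # BOX DRAWINGS LIGHT DOWN AND RIGHT
utf8_bytes = corner.encode("utf-8")     # three bytes: E2 94 8C

# Wrong codec: each byte becomes its own unrelated character.
# The first one is U+00E2, the a-with-circumflex described above.
mojibake = utf8_bytes.decode("cp1252")

# Right codec: the three bytes collapse back into the original character.
restored = utf8_bytes.decode("utf-8")
```

In Python 2 terms, utf8_bytes plays the role of the str that reaches draw.text, and restored is what text.decode('utf-8') hands over instead.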

Putting it all together:

text = u"\u250c\u2500\u2510\u2502\u2514\u2518\u255e\u2550\u2561\u2564\u2567\u2558\u255b"

And now the coding declaration and the actual encoding used to save the file don't matter, because the file is pure ASCII.
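
For reference, here is a sketch of the question's script with only that change applied, wrapped in a function so the font and output paths are explicit. The function name draw_text_to_tiff and its parameters are made up for illustration; the PIL calls are the ones from the question.

```python
# Box-drawing characters written as escapes, so this source file is pure ASCII.
TEXT = (u"\u250c\u2500\u2510\u2502\u2514\u2518"
        u"\u255e\u2550\u2561\u2564\u2567\u2558\u255b")


def draw_text_to_tiff(text, font_path, out_path, size=14):
    """Draw `text` in black on a white 1-bit image and save it as TIFF."""
    # Imported here so the module still loads where PIL is not installed.
    from PIL import Image, ImageDraw, ImageFont

    img = Image.new("1", (1200, 1600), 1)          # mode "1", filled with white
    font = ImageFont.truetype(font_path, size, encoding="unic")
    draw = ImageDraw.Draw(img)
    draw.text((40, 0), text, font=font, fill=0)    # fill=0 is black
    img.save(out_path, "TIFF")
```

Calling draw_text_to_tiff(TEXT, "cour.ttf", "imagefile.tif") corresponds to the original script. Whether the glyphs actually appear still depends on the font.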

But you may still get the missing-character rectangles, because many fonts don't have the line-drawing characters. I don't know what your cour.ttf is, but I found two Courier TTF fonts on my system, one from an old Mac OS and one from Windows XP, and neither one has them. If that's your problem you obviously need to use a different font.
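
If you would rather check a font programmatically than by eye, something like the following may help. It is a sketch under assumptions: it uses fontTools, a separate third-party package that is not part of PIL and is not mentioned in the question, and both helper names are made up for illustration.

```python
def missing_codepoints(text, supported):
    """Return the characters of `text` whose code points are not in `supported`."""
    return [ch for ch in text if ord(ch) not in supported]


def load_supported_codepoints(font_path):
    """Return the set of code points the font's Unicode cmap covers.

    Assumption: fontTools is installed; it is imported lazily so the
    pure-Python helper above works without it.
    """
    from fontTools.ttLib import TTFont

    # getBestCmap() returns a {code point: glyph name} mapping, or None
    # when the font has no Unicode cmap table at all.
    cmap = TTFont(font_path).getBestCmap() or {}
    return set(cmap)
```

For example, missing_codepoints(text, load_supported_codepoints("cour.ttf")) would list the characters that font cannot draw; an empty list means every character in text is mapped.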

One other possibility: If you're still getting mojibake with the fixes above, cour.ttf may not be a Unicode-ordered font file, but one of the older TTF orders. A font viewer should show you the TTF order of the file. (I'm pretty sure Windows comes with one, but I have no idea where it is in Windows 7 or how to use it.) Then you need to pass the right thing in place of 'unic' as the encoding when loading the font. But most fonts that aren't either unic or symb probably won't have the line-drawing characters anyway.
