How do I display non-English characters in Python?

Question

I have a python dictionary which contains items that have non-english characters. When I print the dictionary, the python shell does not properly display the non-english characters. How can I fix this?

Solution

When your application prints hei\xdfen instead of heißen, you are not printing the unicode string itself but the string representation of the unicode object.

Let us assume your string ("heißen") is stored in a variable called text. To see what you are dealing with, check the type of this variable by calling:

>>> type(text)

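If you get <type 'unicode'>, you are not dealing with a byte string (str) but with a unicode object.

If you do the intuitive thing and try to print the text by invoking print(text), you may not get out the actual text ("heißen"): depending on how Python detects your terminal's encoding, the output can be garbled or the call can fail with a UnicodeEncodeError. And when the object is echoed at the interactive prompt or printed inside a container such as a dict, you always see its string representation instead.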
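The difference between an object's escaped representation and its text can be reproduced in a few lines. This is a minimal sketch in Python 3 (an assumption; the answer itself targets Python 2), where `str` plays the role of Python 2's `unicode` and the built-in `ascii()` produces roughly what Python 2's `repr()` showed, minus the `u` prefix.

```python
# Python 3 sketch; the surrounding answer is written for Python 2.
text = "hei\u00dfen"      # the word "heißen", written with an escape for portability

escaped = ascii(text)     # escaped representation, comparable to Python 2's repr()
plain = str(text)         # the text itself, unchanged

print(escaped)            # prints 'hei\xdfen' (including the quotes)
print(len(plain))         # prints 6: six characters, one of them non-ASCII
```

Containers such as dicts and lists use the representation of each element when they are printed, which is why the escapes show up even though the stored text is intact.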

To fix this, you need to know which encoding your terminal has and print out your unicode object encoded according to the given encoding.

For instance, if your terminal uses UTF-8 encoding, you can print out a string by invoking:

print text.encode('utf-8')
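To make the role of the terminal encoding concrete, here is a small sketch, again in Python 3 as an assumption, showing that the same text becomes different bytes under different encodings. The variable names are illustrative only.

```python
import sys

text = "hei\u00dfen"  # "heißen"

# Encoding turns text into the bytes that a terminal of that encoding expects.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(utf8_bytes)     # b'hei\xc3\x9fen'  (ß takes two bytes in UTF-8)
print(latin1_bytes)   # b'hei\xdfen'      (ß takes one byte in Latin-1)

# What Python believes the terminal uses; this can be None when output is
# redirected, so fall back to UTF-8 here (the fallback is an assumption).
terminal_encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
print(terminal_encoding)
```

If the bytes sent do not match what the terminal expects, the characters are displayed incorrectly even though the program ran without errors.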

Those are the basic concepts. Now let me give you a more detailed example. Let us assume we have a source code file, saved as UTF-8, that stores your dictionary:

mydict = {'heiße': 'heiße', 'äää': 'ööö'}

When you type print mydict you will get {'\xc3\xa4\xc3\xa4\xc3\xa4': '\xc3\xb6\xc3\xb6\xc3\xb6', 'hei\xc3\x9fe': 'hei\xc3\x9fe'}. Even print mydict['äää'] doesn't work: it results in something like ├Â├Â├Â. The nature of the problem is revealed by print type(mydict['äää']), which tells you that you are dealing with a plain str object, that is, a sequence of bytes.

In order to fix the problem, you first need to decode the string from your source code file's charset to a unicode object and then represent it in the charset of your terminal. For individual dict items this can be achieved by:
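The garbled output can be reproduced by decoding the stored UTF-8 bytes with the wrong codec. The sketch below uses Python 3 and assumes the terminal uses code page 850, which is a guess based on the ├Â pattern and not something the original states.

```python
# The bytes stored for the value 'ööö' when the source file is saved as UTF-8.
raw_value = b"\xc3\xb6\xc3\xb6\xc3\xb6"

correct = raw_value.decode("utf-8")   # interpreted with the right codec
garbled = raw_value.decode("cp850")   # interpreted the way a cp850 terminal would

# ascii() keeps the output printable on any terminal.
print(ascii(correct))   # '\xf6\xf6\xf6'  (three times ö)
print(ascii(garbled))   # '\u251c\xc2\u251c\xc2\u251c\xc2'  (three times ├Â)
```

Each ö occupies two bytes in UTF-8, and a single-byte code page shows each of those bytes as its own character, so three letters turn into six.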

print unicode(mydict['äää'], 'utf-8')

Note that if the default encoding doesn't match your terminal, you need to encode explicitly:

print unicode(mydict['äää'], 'utf-8').encode('utf-8')

Here the outer encode call specifies the encoding your terminal uses.
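The decode-then-encode idea can be applied to a whole dictionary at once. The following is a sketch in Python 3 (an assumption, since the answer uses Python 2 syntax); `decode_dict` and `encode_for_terminal` are hypothetical helper names, and the byte-string dict stands in for what Python 2 stored for mydict.

```python
def decode_dict(raw, source_encoding="utf-8"):
    """Return a copy of raw with byte-string keys and values decoded to text."""
    return {
        key.decode(source_encoding): value.decode(source_encoding)
        for key, value in raw.items()
    }


def encode_for_terminal(text, terminal_encoding="utf-8"):
    """Encode text for a terminal, replacing characters it cannot represent."""
    return text.encode(terminal_encoding, errors="replace")


# Byte strings as they would be read from a UTF-8 source file.
raw_dict = {
    b"hei\xc3\x9fe": b"hei\xc3\x9fe",
    b"\xc3\xa4\xc3\xa4\xc3\xa4": b"\xc3\xb6\xc3\xb6\xc3\xb6",
}

decoded = decode_dict(raw_dict)
for key, value in decoded.items():
    # Show the bytes a Latin-1 terminal would receive for each value.
    print(ascii(key), encode_for_terminal(value, "latin-1"))
```

Decoding once at the boundary and encoding only at output time keeps the rest of the program working with text, so the source file's charset and the terminal's charset can differ without corrupting the data.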

I really really urge you to read through Joel's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". Unless you understand how character sets work, you will stumble across problems similar to this again and again.
